Description of the data collection process 📋

DCWatch proposes a methodology to collect data from sources accessible to the public.

This methodology has to be adapted to specific contexts (each country may require different data sources for example).

This methodology is by no means the only relevant one. Feel free to propose a new one and explain it by opening a new issue.

Warning

Remember that DCWatch relies on sourcing every piece of data collected. Keep the URLs of the websites or articles you get information from and include them in your contribution.

1. Datacenter identification

First, you need a lead on the possible existence of a datacenter, meaning at least:

  • An approximate location: at least a country, though a region/land or a city name would be better.
  • The name of a company that is supposed to operate, own, construct or even design the datacenter.

2. Find the datacenter's precise location

Based on the information you have, find the precise location: an address, or GPS coordinates.

Most of the time an address is easier to find first. Here is a list of methods you can try to get that information:

If you know its name

Search for it on common mapping applications: Google Maps, OpenStreetMap, Yandex Maps

If you don’t know its name, but have an approximate location

The local press has probably published articles about its inauguration. Using a standard search engine, search for “datacenter CITY_NAME OPERATOR_NAME inauguration” (preferably translated into the local language), where CITY_NAME is the name of the city the datacenter is supposed to be in and OPERATOR_NAME is the name of the company supposed to operate the facility.

These kinds of articles often mention a street name or a district name that will help you narrow down the search.

Note

These kinds of articles are also useful for collecting the operation start date of the datacenter, which is very important information for further analysis. Note it as well and include it in your contribution if possible. Keep the URL of the article, as we need to source everything we collect.

If you don’t have a city name

If you only have a region / land name, try the same search with “datacenter REGION_NAME inauguration”. This sometimes still returns meaningful articles.
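The two search templates above can be assembled with a small helper. This is only a minimal sketch: the function name `build_search_query` and the example place and operator names are hypothetical, and the keywords should still be translated into the local language by hand.

```python
def build_search_query(place_name, operator_name=None, keyword="inauguration"):
    """Build a press-search query such as 'datacenter CITY_NAME OPERATOR_NAME inauguration'.

    place_name is a city or region name; operator_name is optional,
    since you may only know the region at this stage.
    """
    parts = ["datacenter", place_name]
    if operator_name:
        parts.append(operator_name)
    parts.append(keyword)
    return " ".join(parts)


# Hypothetical examples: a city with a known operator, then a region only.
print(build_search_query("Marseille", "ExampleCorp"))
print(build_search_query("Bavaria"))
```

Paste the resulting string into the search engine of your choice, and keep the URL of every article you end up using.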

If you don’t have much on the location

Sometimes simply browsing the websites of the companies involved in the datacenter's investment, design, construction or operation will lead you to sufficiently meaningful information. Look for the official websites of those companies and for articles mentioning a new datacenter or a new project for a client that is building one.

If none of those work, someone might have found it before

Many websites try to reference datacenters around the world. Most of the time, retrieving data from those websites programmatically is forbidden, but they might help you identify a single datacenter manually. See useful sources (don’t hesitate to propose new ones if not listed).

3. Get a satellite image of the Datacenter, and the land plots it occupies

Find this datacenter on a map. Multiple data sources can give you this opportunity:

Warning

Once you are positioned on the right location and can see the facility, check the date of the images you are looking at. Depending on the map you use, those images can be outdated. Even if the date is recent, some old images may be labelled as recent ones, so it’s worth checking the same location on 2 or 3 different maps to identify anomalies.

4. Cross compare the data between satellite images and official registries

Once you think you have found the exact location, challenge this assumption by looking for official records of the building's construction, sale, and/or land and building ownership. The relevant sources vary with the country you are looking at.

5. Collecting surface data 🏢

Now for the fun part. How big is the facility? Its land plot? The building (there might be multiple buildings)?

Land occupied by the building

Figure: animation showing an example of measuring the land occupied by a datacenter building, on Google Maps.
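If you note the GPS coordinates of the building's corners instead of reading the area from the map tool, the footprint can be approximated with a short script. The sketch below is not DCWatch's official method: it assumes a spherical Earth, a local flat projection (reasonable at building scale), and corners listed in order around the outline. The function name `footprint_area_m2` is hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius, spherical approximation


def footprint_area_m2(corners):
    """Approximate area in square metres of a polygon given as (lat, lon) pairs in degrees.

    Corners must be listed in order around the outline (either direction).
    """
    if len(corners) < 3:
        raise ValueError("need at least three corners")

    # Project to local planar metres around the first corner
    # (equirectangular approximation, scaled by the cosine of the mean latitude).
    mean_lat_rad = math.radians(sum(lat for lat, _ in corners) / len(corners))
    ref_lat, ref_lon = corners[0]
    points = [
        (
            math.radians(lon - ref_lon) * math.cos(mean_lat_rad) * EARTH_RADIUS_M,
            math.radians(lat - ref_lat) * EARTH_RADIUS_M,
        )
        for lat, lon in corners
    ]

    # Shoelace formula over consecutive vertex pairs, closing the polygon.
    twice_area = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2


# Hypothetical rectangular outline, corners in order.
outline = [(48.000, 2.350), (48.000, 2.351), (48.001, 2.351), (48.001, 2.350)]
print(round(footprint_area_m2(outline)), "m2")
```

The same function works for a land plot outline, although official land registry data remains the more reliable source when it is available.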

Land plot surface

The land plot surface might be tricky to measure manually, as its shape is often irregular.

The easiest way to get this data is from the official land registry; see the list of building / land registries identified so far.

6. Collect data on other indicators

🛢️ Fossil energy metric

Potential useful sources: the operator’s official website (often expressed as a list of generators with a given power capacity), official regulatory records (“ICPE” in France, for example)
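When a source lists generators with their unit capacity, the total backup capacity is a simple sum. Here is a minimal sketch, assuming the list has been transcribed as (count, unit capacity in MW) pairs. The function name and the figures are hypothetical, not taken from any real facility.

```python
def total_generator_capacity_mw(generators):
    """Sum the capacity of a generator list given as (count, unit_capacity_mw) pairs."""
    return sum(count * unit_capacity_mw for count, unit_capacity_mw in generators)


# Hypothetical listing: 4 generators of 2.5 MW and 2 generators of 1.0 MW.
listing = [(4, 2.5), (2, 1.0)]
print(total_generator_capacity_mw(listing), "MW")
```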

⚡ Electricity metrics

Potential useful sources: the operator’s official website for power, or an estimation based on the surface of the building (documentation coming soon on that part)

💧 Water metrics

Potential useful sources: cooling technologies, WUE (Water Usage Effectiveness) when published, local temperatures, estimations based on previous studies, etc.
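When an operator publishes a WUE figure, a rough annual water estimate can be derived from it, since WUE is commonly expressed in litres of water per kWh of IT equipment energy. The sketch below rests on several assumptions: the IT power and load factor are inputs you must source or estimate yourself, the load is treated as constant over the year, and the function name and example values are hypothetical.

```python
HOURS_PER_YEAR = 8760  # non-leap year


def annual_water_use_litres(wue_l_per_kwh, it_power_kw, load_factor=1.0):
    """Estimate annual water use from a published WUE.

    wue_l_per_kwh: litres of water per kWh of IT energy
    it_power_kw:   IT power capacity in kW
    load_factor:   average share of that capacity actually used (0 to 1)
    """
    it_energy_kwh = it_power_kw * load_factor * HOURS_PER_YEAR
    return wue_l_per_kwh * it_energy_kwh


# Hypothetical facility: WUE of 1.8 L/kWh, 1,000 kW of IT power, 60% average load.
print(round(annual_water_use_litres(1.8, 1000, load_factor=0.6)), "litres per year")
```

Treat the result as an order of magnitude only, and record the source of every input value alongside it.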

❄️ Cooling technologies

Potential useful sources: the operator’s official website, satellite images, etc.