Hacker News | marklit's comments

Thanks for pointing that out. I'll update the post.

I love that idea. I don't have time for anything elaborate today but I dropped two visualisations at the bottom of the post.

I love the radial one, which looks like it was laid out as a "mirror tower" installation and then maybe converted to PV?

Thanks, interesting to see!

141 scenes in this dataset have MP4 counterparts.

I can't get the vector basemap to render properly in QGIS with any other projection. I love using EPSG:3301 for Europe, etc., but loads of strange things happen. Even 4326 is an issue.


Glad you enjoyed it.

There should be enough SQL in the blog to repurpose for extracting the Wildberries locations and seeing what they land on top of. I'd never heard of this firm before you mentioned it.

From Google:

> Citibank operates over 2,300 ATMs within more than 600 U.S. branches, with a total network of over 65,000 fee-free ATMs

So the 57,163 Citibank locations are probably a combination of their branches and ATMs.

Update: I reviewed Alltheplaces (ATP) a while back. They scrape company websites for store locations and reported 68,227 locations for Wildberries. ATP is one of the sources Overture uses, but Overture seems to use only 1.55M of the records from ATP's 19M-record dataset. https://tech.marksblogg.com/alltheplaces.html


I contribute to ATP and can confirm that the author of the wildberries spider was deliberately trying to collect https://wiki.openstreetmap.org/wiki/Tag:shop%3Doutpost (online order pickup locations). It's not common for the current set of ATP spiders to capture such features. A quick search indicates that OSM doesn't appear to have tags designed to capture pickup/dropoff partnerships between retail brands, for example, an agreement from a pet supply shop to allow collection of parcels from select fuel stations of a partner brand. Thus I think the author of the wildberries spider has used shop=outpost as the closest tag available in OSM, and Overture Maps' filters wouldn't be able to omit these features from their dataset unless Overture Maps adds wildberries to their exclusion list.

Ideally ATP's "located_in" and "located_in:wikidata" fields would be populated for these wildberries pickup locations, making it clear the pickup location is part of a parent feature (e.g. fuel station, supermarket). These fields are specific to ATP and are not OSM fields. OSM would expect features to be merged and a hypothetical field such as "pickup_brands:wikidata=Q1;Q2;Q3" be used instead on the parent feature.

ATP has a much more inclusive set of features it can extract than what Overture Maps, TomTom et al care about. As Overture Maps is more opinionated on what they aggregate they will filter out ATP extracted features such as individual power poles, park bench seats, local government managed street and park trees, stormwater drain manholes, cemetery plots, weather stations, tsunami buoys, etc. I think there might be some exceptions if it helps TomTom et al with their products such as speed camera locations, national postal provider drop-off/pick-up locations within other branded retail shops, etc.


I'm having a hard time tracking down the OSM-specific actions around this, but Australia shifted its coordinates by 1.8 metres back in 2017 due to continental drift https://www.ga.gov.au/news/news-archive/australia-officially...

The Airport in Tartu, Estonia had a navigation upgrade last year in order to help mitigate navigation jamming that's taking place in the region. https://www.eans.ee/en/uudised/tanasest-saab-tartu-lennuvalj...

Defcon had a great talk on all the different navigational systems pilots can use and a note at the end that these shouldn't be decommissioned at the rate they're experiencing atm https://www.youtube.com/watch?v=wSVdfOn737o


From what I've seen in news reports, China has built a lot of tower blocks that are 10s of floors, rather than the 4-5-floor buildings I saw when I worked in and travelled around India.

India has ~4x the population of the US so the ratio of buildings isn't much of a surprise.


The US has only used 25% of its land. Overture published a land use dataset a while back that could go some way to verifying how much land on earth is urban, covered in forest, etc. That might only need a single SQL statement given how they structured their data.

I have some analysis around that topic in the middle of this post: https://tech.marksblogg.com/overture-land-cover.html
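
As a rough illustration of the "single SQL statement" idea, here is a minimal sketch that uses Python's built-in sqlite3 module rather than DuckDB, purely so it runs without extra installs. The table name `land_cover`, its columns `subtype` and `area_km2`, and every value in it are hypothetical toy data, not Overture's actual schema or figures.

```python
import sqlite3

# In-memory toy table. Schema and values are hypothetical, not Overture's.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE land_cover (subtype TEXT, area_km2 REAL)")
conn.executemany(
    "INSERT INTO land_cover VALUES (?, ?)",
    [("urban", 10.0), ("forest", 50.0), ("forest", 20.0), ("crop", 20.0)],
)

# One statement: total area per class and its share of all mapped land.
query = """
SELECT subtype,
       SUM(area_km2) AS total_km2,
       ROUND(100.0 * SUM(area_km2)
             / (SELECT SUM(area_km2) FROM land_cover), 1) AS pct
FROM land_cover
GROUP BY subtype
ORDER BY total_km2 DESC, subtype
"""
rows = conn.execute(query).fetchall()

for subtype, total_km2, pct in rows:
    print(f"{subtype}: {total_km2} km2 ({pct}%)")
```

The same GROUP BY shape should carry over to a real land cover table, with the area column swapped for whatever the dataset actually provides.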


That 5 TB of data will probably be 300-400 GB in Parquet. Try to denormalise the data into a few datasets, or just one dataset if you can.

DuckDB should be able to return results in milliseconds if the query only touches the smaller columns, and even faster if the row-group stats can be used to answer it.
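
To show why row-group stats matter, here is a minimal plain-Python sketch of min/max pruning. It is not DuckDB's code; the `RowGroupStats` class, the `row_groups_to_scan` function and the stats values are all made up for the illustration.

```python
from dataclasses import dataclass


@dataclass
class RowGroupStats:
    """Min/max statistics for one column in one row-group (toy values)."""
    index: int
    min_value: int
    max_value: int


def row_groups_to_scan(stats, lower, upper):
    """Return indexes of row-groups whose [min, max] overlaps [lower, upper]."""
    return [
        rg.index
        for rg in stats
        if rg.max_value >= lower and rg.min_value <= upper
    ]


# Hypothetical stats for a column that is roughly sorted across row-groups.
stats = [
    RowGroupStats(0, 0, 99),
    RowGroupStats(1, 100, 199),
    RowGroupStats(2, 200, 299),
    RowGroupStats(3, 300, 399),
]

# A filter like "value BETWEEN 150 AND 220" only needs two of the four groups.
matching = row_groups_to_scan(stats, 150, 220)
print(matching)
```

Pruning like this works best when the data is sorted on the filtered column. If every row-group spans the full value range, nothing can be skipped.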

You can host those Parquet files on a local disk or S3. A local disk might be cheaper if this is exposed to the outside world, and it also gives you a price ceiling on hosting.

If you have a Parquet file with billions of records and row-groups measuring into the thousands then hosting on something like Cloudflare where there is a per-request charge could get a bit expensive if this is a popular dataset. At a minimum, DuckDB will look at the stats for each row-group for any column involved with a query. It might be cheaper just to pay for 400 GB of storage with your hosting provider.
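
The trade-off can be sketched with some back-of-the-envelope arithmetic. Every number below is a placeholder, not any provider's real pricing, and `requests_per_query` is an assumption that depends on how the client fetches metadata and column chunks.

```python
def monthly_request_cost(queries_per_month, requests_per_query,
                         price_per_million_requests):
    """Estimate a monthly bill when the host charges per request."""
    total_requests = queries_per_month * requests_per_query
    return total_requests / 1_000_000 * price_per_million_requests


# Hypothetical workload and prices, for illustration only.
per_request_bill = monthly_request_cost(
    queries_per_month=2_000_000,      # assumed popularity of the dataset
    requests_per_query=50,            # assumed range requests per query
    price_per_million_requests=0.50,  # placeholder price
)
flat_storage_bill = 20.0              # placeholder flat fee for ~400 GB

print(f"per-request: ${per_request_bill:.2f}, flat: ${flat_storage_bill:.2f}")
```

The flat fee stays fixed however popular the dataset becomes, whereas the per-request bill grows linearly with traffic.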

There is a project to convert OSM to Parquet every week and we had to look into some of those issues https://github.com/osmus/layercake/issues/22


I'm hoping to use most of this system for the next 10 years. At some point, I want to add some beefy GPUs to it when I get back into 3D again.

