
> 2. There is no published tooling for how someone else can reproduce the same data.

This is true, and is actually the whole point of Overture.

Overture was developed to enable private companies to leverage open data (like OpenStreetMap) but also combine it with their proprietary data and processes.

The intention is to share the result with a relatively permissive license (a new thing called the Community Database License Agreement) but keep the process and underlying data proprietary.



Yes, exactly. Overture is, for better or worse, looking to create a dataset that is more difficult to mess with / contribute to.

I'm big on OpenStreetMap, but I can't deny it's a bit of a liability for Facebook and other companies that display maps to their users. There is the occasional vandalism edit that simply can't be shown to an end-user. Facebook put significant effort into maintaining a moderated version of the OSM database that lags behind the real-time edits. Facebook, Microsoft, TomTom, etc. know this is a ton of work and want to pool their resources. Making it open also helps to openly compete with Google, the other big map data provider.

If you want to contribute to Overture as an end-user, AFAICT your best option is to edit OpenStreetMap and see if your changes eventually get pulled in. Overture has promised the OSM community that they'll make much of their data available to be contributed back; we'll see if that pans out.

When it comes to AllThePlaces -- as an OSM nerd it seems like there is an opportunity to build a better bridge between this and OpenStreetMap, to make it easier to quickly update businesses in an area. Recently there has been a pretty successful push to link OSM data with Wikidata, using tools like the OSM ↔ Wikidata matcher [0]. For POIs, it's a lot of work to add a bunch of local businesses, even with tools like EveryDoor [1]. It would be so cool to see AllThePlaces integration into RapID for example, if there isn't already(?)

[0] https://osm.wikidata.link/

[1] https://every-door.app/


> There is the occasional vandalism edit that simply can't be shown to an end-user.

I remember the same sort of arguments being made about how web sites could not possibly ever accept user comments or submissions, or could not ever risk having users sending links to one another. Those all proved to be false.


> I remember the same sort of arguments being made about how web sites could not possibly ever accept user comments or submissions

The heyday of comments sections on news websites is now in the past. Not long ago, Lonely Planet took down its renowned Thorn Tree forums, which had been a big part of the travel internet since the 1990s. Friends who run a major website for a particular hobby told me that they canceled their plans to launch a forum, since their site is advertising-supported and a forum could damage their relationships with advertisers.

Reliable moderation costs money, and if you don’t moderate heavily enough, you’re going to get user comments that tarnish your brand (or at least scare execs into thinking that the brand will be tarnished).


Editing OpenStreetMap isn't quite equivalent to commenting on a post. OSM allows you to edit anything, which is fantastic but also allows for more serious vandalism. We have seen major cities renamed to racial slurs, for example. As with Wikipedia, the community is generally very good about correcting these issues quickly. It's an uncommon problem that a lot of people work to mitigate. But I stand by what I said: vandalism on OSM is in many cases unacceptable to show to end-users.

The OSM basemap is used in many official publications, in many social media applications, etc. I'd actually recommend using the basemap for most simple mapping cases, as long as it's being continuously updated from upstream (or, it is the upstream basemap). If you take a snapshot of that data, however, you risk capturing some bad stuff. That is a real risk for a company like Meta.


I used to work in the industry. A few things I’ve seen make it into production maps shown to large numbers of users are phallic objects drawn as lines on the map and a series of what appear to be random test lines drawn as roads in the Arctic. Neither of these examples is great publicity if discovered in a finished, shipped product that people are paying for.

Interestingly, they’re also a lot harder to catch than the slur naming example.


> We have seen major cities renamed to racial slurs, for example.

Once in 19 years, I think?

It’s not great but let’s not pretend it’s more of a problem than it is.


No, it's not a huge issue within the OSM community. But surely you see why it's an issue for large companies looking to use the OSM basemap in their projects.


Perfection is an impossible standard to uphold. Even if you do everything in-house, and even if you are Disney, you cannot avoid the occasional scandal:

https://en.wikipedia.org/w/index.php?title=The_Rescuers&oldi...


> not possibly ever accept user comments or submissions, or could not ever risk having users sending links to one another. Those all proved to be false

The funny thing is that, judging by the majority of pentesting reports, this part is most often the cause of a data breach.

It is one thing to expose something to a few people you know, and another to offer a way to send things to millions of people on constantly abused platforms like Twitter or YouTube.


Comments are not a core part of the content.

What this is talking about is closer to Wikipedia. And vandalism is a real thing there, so a portion of topics is moderated.


Did they? I've seen a lot of e.g. newspaper sites removing comment sections, or switching them to not display comments until they've been moderated.


There are many links to legal content that Instagram and FB messenger will not allow you to send to your friends, so I'm not sure how vindicated you are.


That’s more of an indictment of Meta than an argument against uncensored user-to-user communication.


> It would be so cool to see AllThePlaces integration into RapID for example, if there isn't already(?)

Part of the problem is the not entirely clear copyright or copyright-like status of this dataset, thanks to https://en.wikipedia.org/wiki/Database_right and similar things. (In general, OSM is really careful about the legal status of datasets being imported.)


Ah, great catch. After posting my comment I came across this GitHub issue that touches on this point as well [0]

As always, I will be vigilant about checking licenses before adding data to OSM!

[0] https://github.com/alltheplaces/alltheplaces/issues/5133


Sadly, copyright and copyright-like restrictions are really complex. I am not 100% sure whether the concerns that I raised in this issue are really problematic, but...


Wouldn't the closed processes and underlying data severely limit communities such as OSM from using Overture Maps results for anything other than a validation of what OSM already knows from other sources?

Perhaps Overture Maps has used impressively accurate satellite imagery tracing to detect the demolition and rebuild of a structure somewhere in Sudan, and can output a new polygon. No OSM mapper is setting foot in Sudan, and recent satellite imagery for the area is not available through companies that share such data for OSM use.

The issue for an OSM mapper who sees the conflict between OSM (with the old building) and Overture Maps (with the new building) is that they don't have any information to know which result is accurate. Is OSM just out of date? Has Overture Maps produced the result from outdated satellite imagery, and OSM is more up-to-date? Is the result from Overture Maps the product of a mistake in an automated tracing algorithm?


> Wouldn't the closed processes and underlying data severely limit communities such as OSM from using Overture Maps results for anything other than a validation of what OSM already knows from other sources?

Seems like a play at the old Microsoft "Embrace, Extend" approach. Whether or not there's an Extinguish after that is yet to be determined.


These are all valid questions, and commonly raised concerns, about the Overture Project.


Sounds like that's going to be a problem for proving/reproducing results independently then. :(


I wonder what the reasoning behind this could be. Is it that they don't want to disclose their data collection practices, or that they can't for legal reasons?

For instance, the data could come from Alexa, PCs, or any devices with forced opt-ins that keep scanning all your neighbours' Wi-Fi networks and MAC addresses.

I believe there was also an initiative where Amazon devices would provide ad hoc internet connectivity by piggybacking on other Amazon devices on different networks that have connectivity.

So all the openness but without any controls. There should already be a better term for things like this.



