What's the Most Concave State in the U.S.? Using R to Solve a Geography Puzzle (rapgenius.com)
177 points by lil_tee on Sept 25, 2013 | 57 comments



Fun stuff, especially if you're a U.S. geography nerd (states with the largest coastlines, states that border the most states, all that good but mostly arbitrary stuff)

Pedantic, I'm sure, but from the title before I clicked on it I was trying to think of the state whose shape might independently be considered the most concave (though that may be much harder to define). This version of concavity depends largely on the shapes of the states around it (e.g. if Nevada split into 6 horizontal states, suddenly California would be the winner).


That was what I thought it would be at first as well. My definition of "most" concave: pick two points at random in the state. What is the probability the segment connecting them leaves the state? In this case the hard part is defining what to do about water. As an example, I live in Maryland. It is easily the most concave if you don't treat the Chesapeake Bay as part of the state. If you do, not quite as concave. See maps in the link for the shape of MD if you are not familiar with it.
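
A minimal sketch of that random-points measure, assuming the state is given as a single simple polygon in planar coordinates. The function names and the L-shaped test polygon are made up for illustration, and the sketch sidesteps the water question entirely: anything outside the polygon counts as outside the state.

```python
import random

def point_in_polygon(p, poly):
    """Ray-casting test: True if point p lies inside the polygon (list of (x, y))."""
    x, y = p
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def orient(a, b, c):
    """Signed area test: > 0 if c is left of the line a->b, < 0 if right."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def segments_cross(a, b, c, d):
    """True if segments ab and cd cross at a point interior to both."""
    return (orient(a, b, c) * orient(a, b, d) < 0
            and orient(c, d, a) * orient(c, d, b) < 0)

def leaves_polygon(a, b, poly):
    """With both endpoints inside, the segment leaves iff it crosses an edge.
    Segments passing exactly through a vertex are ignored (probability zero)."""
    n = len(poly)
    return any(segments_cross(a, b, poly[i], poly[(i + 1) % n]) for i in range(n))

def random_point_inside(poly, rng):
    """Rejection sampling from the bounding box."""
    xs = [p[0] for p in poly]
    ys = [p[1] for p in poly]
    while True:
        p = (rng.uniform(min(xs), max(xs)), rng.uniform(min(ys), max(ys)))
        if point_in_polygon(p, poly):
            return p

def exit_probability(poly, samples=20000, seed=0):
    """Estimated probability that a segment between two random interior points leaves."""
    rng = random.Random(seed)
    exits = 0
    for _ in range(samples):
        a = random_point_inside(poly, rng)
        b = random_point_inside(poly, rng)
        if leaves_polygon(a, b, poly):
            exits += 1
    return exits / samples

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
l_shape = [(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]
print(exit_probability(square, samples=2000))   # convex: no segment can leave
print(exit_probability(l_shape, samples=2000))  # some segments cut across the notch
```

Treating the Chesapeake Bay as part of Maryland or not would amount to feeding this a different polygon, so the answer depends on that choice just as the comment says.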


I was going to go with "Divide the land area of a state by the area of its convex hull. Which is the lowest percentage?"
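
A rough sketch of that ratio for a single simple polygon, using the shoelace formula for area and Andrew's monotone chain for the hull. Both are standard techniques picked here for illustration, not anything taken from the article, and the function names are hypothetical. Islands and multi-part states are not handled.

```python
def polygon_area(poly):
    """Shoelace formula: absolute area of a simple polygon given as (x, y) tuples."""
    total = 0.0
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def convex_hull(points):
    """Andrew's monotone chain: hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]

def hull_area_ratio(poly):
    """Area of the polygon divided by the area of its convex hull (1.0 = convex)."""
    return polygon_area(poly) / polygon_area(convex_hull(poly))

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
l_shape = [(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]
print(hull_area_ratio(square))   # 1.0 for a convex shape
print(hull_area_ratio(l_shape))  # 3 / 3.5: the hull fills in half of the missing corner
```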


My first thought was to go with the length of the state's boundary that is on the convex hull divided by the length of the whole boundary.

So...we've got at least 3 ways so far to assign a measure of convexity/concavity to a state, and they give quite different results.

For instance, your method and the method in the comment you responded to both would assign a low concavity to a state that is almost a square, except that the boundary has a high frequency, low amplitude sawtooth pattern imposed on it. Mine would score that as a very high concavity.
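
To make the sawtooth example concrete, here is a small sketch under stated assumptions: the "state" is a square whose sides carry shallow inward notches, and a boundary edge counts as lying on the hull when every vertex of the polygon is on one side of the line through it. The helper names and the notch dimensions are invented for illustration. Because the notches point inward, the hull is just the plain square, so its area is known without computing a hull.

```python
import math

def cross(o, a, b):
    """Signed area test for point b relative to the line o->a."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def hull_boundary_fraction(poly):
    """Fraction of the perimeter made of edges that lie on the convex hull."""
    n = len(poly)
    on_hull = 0.0
    perimeter = 0.0
    for i in range(n):
        a, b = poly[i], poly[(i + 1) % n]
        length = math.hypot(b[0] - a[0], b[1] - a[1])
        perimeter += length
        sides = [cross(a, b, p) for p in poly]
        # A supporting edge has the whole polygon on one side of its line.
        if all(s >= 0 for s in sides) or all(s <= 0 for s in sides):
            on_hull += length
    return on_hull / perimeter

def polygon_area(poly):
    """Shoelace formula for a simple polygon."""
    n = len(poly)
    total = sum(poly[i][0] * poly[(i + 1) % n][1] - poly[(i + 1) % n][0] * poly[i][1]
                for i in range(n))
    return abs(total) / 2.0

def sawtooth_square(teeth, amplitude):
    """Counter-clockwise square of side `teeth` with one inward notch per unit of side."""
    side = float(teeth)
    pts = []
    for k in range(teeth):  # bottom side, left to right
        pts += [(k, 0.0), (k + 0.5, amplitude)]
    for k in range(teeth):  # right side, bottom to top
        pts += [(side, k), (side - amplitude, k + 0.5)]
    for k in range(teeth):  # top side, right to left
        pts += [(side - k, side), (side - k - 0.5, side - amplitude)]
    for k in range(teeth):  # left side, top to bottom
        pts += [(0.0, side - k), (amplitude, side - k - 0.5)]
    return pts

saw = sawtooth_square(8, 0.125)
print(hull_boundary_fraction(saw))     # no edge lies on the hull, only the notch corners touch it
print(polygon_area(saw) / (8.0 * 8.0)) # area ratio stays close to 1
```

The boundary-length measure sees none of the border on the hull, while the area ratio barely moves, which is the disagreement described above.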


I was interested in this and did a very quick google search. Measuring concavity/convexity of a generic 2D surface object doesn't seem to be a very common operation. The one reference I found seems to do what you describe:

http://www.ncbi.nlm.nih.gov/pubmed/7945702

Although I don't have PubMed access so I can't be sure.

This is an interesting problem.

I chose the "random points" method because it seemed easier to me to get an approximate answer for a generic shape rather than try to figure out areas. Lengths didn't occur to me though and seem interesting, although the shape you describe doesn't "feel" concave to me.


Yeah that was what I was expecting they'd do also.

Maybe you'd want to compute the convex hull on a per-connected-component basis, though, to avoid allowing small islands to have undue influence.

Either that, or say that any area of sea within its convex hull gets treated as part of the state's area. This approach differs in that it special-cases sea over other kinds of land.


Would you not get the mean height above sea level of all points along the border and divide that by the mean height above sea level of the entire state (which I guess is really an average of some sampling of points)?


For that suggestion I was just considering it a two-dimensional problem with a "sea or not sea" distinction, not using heights above sea level.


Maths challenge:

I wonder if you could prove this "probability of a random line segment violating convexity" definition equivalent to something given in terms of a ratio of different areas like the area to convex hull area suggestion below.


If you're going to include water I think Hawaii may have Maryland beat. Maybe even Alaska?


I forgot about islands...

But I don't really think of Hawaii as "concave". So I'm going to refine my definition to only allow the points to be chosen in connected areas.


> So I'm going to refine my definition to only allow the points to be chosen in connected areas.

Are two areas considered connected if they cross a river, once?


I thought it was going to look at which state had the biggest average drop in elevation from its border, which in hindsight is silly since the entire surface of the earth is pretty convex...


On the scale of most US states, the convexness of the surface of the Earth is fairly minimal.


A state with a mountain range would be fairly convex by comparison. An entirely flat state, like Kansas, would probably have the lowest relative center to edge ratio.


I was thinking of concavity as a state which had the most distance outside the state between two points inside the state, somehow proportional to size or area. Florida seems particularly concave to me (it's practically shaped like a banana), Massachusetts slightly less so.


I think this metric might weight Tennessee higher than one would intuitively desire. Perhaps:

1. Draw a line segment between two connected points in the state, such that only the endpoints of the line are contained in the state.

2a. Draw a line segment perpendicular to this one, such that one end touches the first line, the other end touches the border of the state, and only one endpoint is contained within the state. The length of the largest such line is the "concavity" of the state.

2b. Alternatively, measure the area contained between line #1 and the state border. The largest such area is the "concavity" of the state.

2c. Alternatively, measure the ratio of the area in #2b to the length of the corresponding line in #1.


Doesn't the length of the coastline depend on the scale you measure at?


Any idea whether this depends on the projection used to get the map in the first place? I mean, the "straight lines" are really curves that lie in the boundary of the Earth's surface, unless I'm missing something major.


It doesn't matter. He's counting the number of times a continuous curve C on the surface of the Earth crosses other continuous curves (state boundaries). A crossing is a property of the curve and its embedding onto the Earth's surface. While a (continuous) map projection can deform both C and the state boundaries, it cannot create or destroy crossings.

To draw the crossings, however, he needs to pick a projection and project both C and the state boundaries, which I guess is why he included some PROJ.4 calls.


It does matter - he only considers continuous curves that are projected onto straight lines, which is a property of the projection.


That's an excellent question. Latitude and longitude are straight in the projection he is most likely using, but great circle arcs are what we consider straight on the globe. The kicker is that most political boundaries are neither. They are Rhumb lines, aka loxodromes, which mean that someone started walking in a direction and kept going.

The standard solution for this is to put lots of little points into the state GIS definition, so that the points get transformed correctly. That way short line segments don't differ by more than a few meters. That means you have to watch out for simplified state representations, but not much else, unless you're being a stickler.
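
A minimal sketch of that densification step, assuming the border is a closed ring of (lon, lat) pairs and that interpolating linearly in those coordinates is acceptable. The right space to interpolate in depends on how the border is legally defined; the `densify` name and the `max_step` parameter are hypothetical.

```python
import math

def densify(ring, max_step):
    """Insert extra vertices so that no edge of the ring is longer than max_step.

    `ring` is a closed polygon given as a list of (x, y) tuples without a
    repeated final point. Interpolation is linear in the input coordinates.
    """
    out = []
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        length = math.hypot(x2 - x1, y2 - y1)
        pieces = max(1, math.ceil(length / max_step))
        for k in range(pieces):
            t = k / pieces
            out.append((x1 + t * (x2 - x1), y1 + t * (y2 - y1)))
    return out

# A 4 x 4 degree box with a vertex at least every degree.
box = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
dense = densify(box, 1.0)
print(len(dense))
```

Each short piece can then be projected point by point, so the error from treating it as straight stays small.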


> Rhumb lines, aka loxodromes, which mean that someone started walking in a direction and kept going.

To be clear, they mean that you keep going in the same compass direction.

If you kept going in the direction which seemed straight ahead to you there on the ground, then what you'd get (under suitably idealized conditions) is a great circle.


Locally, it would zig-zag all over the place, like the US-Canada border: http://www.youtube.com/watch?v=qMkYlIA7mgw


So with a few minor complications convexity generalises to Riemannian manifolds like the earth. You need to replace "straight line" with "minimising geodesic" i.e. shortest path, which don't depend on the choice of coordinate chart, just on the Riemannian manifold structure (which includes an inner product hence a metric).

This is complicated slightly when there isn't a unique shortest path between any given two points (e.g. the earth's north and south poles), leading to definitions of strongly convex, convex and weakly convex. See http://en.wikipedia.org/wiki/Geodesic_convexity and the debate at http://en.wikipedia.org/wiki/Talk:Geodesic_convexity#Dispute...


Yeah, I'm wondering if the projection used has the property that straight lines lie on great circles. And from what I remember about geoids, it seems like the term "great circle" wouldn't even apply to all of them.


I think the projection becomes relevant when you're looking at "straight" borders like Colorado. A great circle will bend slightly north of the northern border, so you can actually just barely touch Wyoming and Nebraska, but I don't think the projection is significant when looking at some of the more irregularly shaped states.
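
A quick way to check the size of that bend, assuming a spherical Earth of radius 6371 km and approximate corner coordinates for Colorado's northern border (41°N, from about 109.05°W to 102.05°W). The sketch finds the midpoint of the great-circle arc between the two corners and compares its latitude with the parallel.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius; a spherical Earth is an assumption here

def to_vector(lat_deg, lon_deg):
    """Unit vector on the sphere for a latitude/longitude in degrees."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    return (math.cos(lat) * math.cos(lon),
            math.cos(lat) * math.sin(lon),
            math.sin(lat))

def great_circle_midpoint(p, q):
    """Midpoint of the shorter great-circle arc between p and q (not antipodal)."""
    a, b = to_vector(*p), to_vector(*q)
    mx, my, mz = a[0] + b[0], a[1] + b[1], a[2] + b[2]
    norm = math.sqrt(mx * mx + my * my + mz * mz)
    lat = math.degrees(math.asin(mz / norm))
    lon = math.degrees(math.atan2(my, mx))
    return lat, lon

# Approximate north-west and north-east corners of Colorado.
west = (41.0, -109.05)
east = (41.0, -102.05)

mid_lat, mid_lon = great_circle_midpoint(west, east)
bulge_km = math.radians(mid_lat - 41.0) * EARTH_RADIUS_KM
print(mid_lat, mid_lon, bulge_km)
```

Under these assumptions the arc peaks roughly 0.05 degrees north of the 41st parallel, on the order of 6 km, which is small but enough to clip the neighbours to the north.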


Tennessee's line looks fairly curved so I'm guessing it's done on a sphere.


No, the plotting is all done in geographic.

That's just the shape of TN's border. It's not a straight line in any coordinate system. (For example, have a look at google maps, which is in geographic. The northern border of the state roughly follows a parallel, but the details are more complicated due to history and local politics.)

I didn't look at the code in detail (and my R is quite rusty), but the fact that he's using the geosphere package suggests that the intersection calculation is being done on a spherical shell, rather than cartesian space.


> It's not a straight line in any coordinate system.

I bet I could propose a coordinate system in which it was a straight line. ;)


Aaaaannnd now I'm playing FTL again. Lining up those 5-room beam strikes is just too much fun!

On topic, though, this is pretty cool. Rivers and coastlines seem to be the best way to get appropriately jagged borders. It's interesting to look at states across the map from east to west and see the shapes get simpler and more geometric over time.


Cancavity has a simple definition, its measure is: area of the region divided by the area of its convex hull. Most concave is value nearest to zero. You can't just make up definitions.


Well, actually the point of definitions is that you can just make them up, but if you're using a non-standard one you really need to say so. I do find the definition in the article pretty pointless (aside from the fact that it has way too many points).


Spelling has strict definitions too. You can't just make up spelling. SCNR


It is an odd definition of concave. It would make more sense to me to require that the line joining two points on the state's border does not cross in and out of the original state when determining the number of other states it crosses... This doesn't really capture concave as a geometric concept either but more aligned with the idea that it is a local property.


I would have thought concave meant in the Z dimension. Find the state with the highest elevations on any two sides with the lowest relative elevations in between.


Pretty cool. But why is this posted on RapGenius?


In anticipation of their launch for GeographyGenius. Social map annotation, which state is the dopest, etc.


...I would actually love social map annotation.


Pretty sure they have some of those already. There's OpenStreetMap, and I seem to recall another project which was more "name this rectangular area" oriented... I forget its name... anyone want to help me out here?


Isn't that basically what Foursquare does?


I'm thinking of people (geo nerds) who like to look at maps and remark about things. Not people who want to tell the world they're at Starbucks.


Have you used Foursquare recently? While it originally focused on broadcasting one's location, it's more about social recommendations now, which really isn't far from "social map annotation" in practice. Especially when you have people leaving notes in parks and bridges and buildings telling you about neat features and not just "the denver omelette is amazing here!"


I have not, that's pretty interesting. Looks like I have some browsing to do between Wikimapia and Findery.

Thanks, guys/gals.



Do you mean something like http://wikimapia.org/ ?


Everyone has hobbies. Theirs just happens to be algorithms and geography.


This reminded me that I want to finish reading the book 'How the States Got Their Shapes.' http://amzn.to/16zBxBo There are some crazy reasons some states have their strange borders.


[Insert discussion here on the propriety of hidden affiliate tags in short URLs.]


It doesn't matter to me in this case. The book is relevant to the discussion and the poster is a human.

But I can certainly understand some people here may want to draw a line in the sand (to use a metaphor from the linked book's description) to prevent spam.



Apologies. Group attitude noted.


It appears that the author makes a mistake in his attempts to simplify the problem, because although he is correct that he only needs to look at points on the edges, he goes on to suggest that he is looking only at corners of the polygon, and not at any of the (infinite number of) points between the corners.


Interesting. But an odd definition of concavity. Surely a better definition would be the state whose area relative to its convex hull is smallest. I'd guess Hawaii or Florida off the top of my head.


It would be neat if the algorithm found the most illustrative solution for each case (e.g. maximizing minimal distance in each state). Especially the Arkansas one is obviously the edge case.


Is this a clever way of recruiting data engineers?


"We have enough venture funding to pay people to work on non-core parts of the business. We are not under that much pressure to make money. The normal work of the business is not sufficiently rewarding so we bribe employees with pet projects."

(http://blog.prettylittlestatemachine.com/blog/2013/02/20/wha...)



