Show HN: Facebook data analyzer (github.com/lackoftactics)
85 points by lackoftactics on April 11, 2018 | 13 comments


Nice work. I created something very similar with Python (https://github.com/leerob/facebook-data-analyzer). It looks like yours touches on some things I didn't get to in mine, like ranking messages. Awesome idea!


Hi, I'm the creator of Facebook Data Analyzer. I'm overwhelmed by the support I got from the community. We have already fixed some issues that kept users from running the script. Thank you, Hacker News.


Where can I find a similar list of the most popular words for other languages? Is it a known format?



I posted it on Product Hunt: https://www.producthunt.com/posts/facebook-data-analyzer. I'd be really grateful if you could upvote!


Dark pattern brain fart: an excellent way to save up user data from users leaving Facebook.

Get it while you can!


Don't worry, Zuckerberg will save it for you.


*Palantir


Palantir will not be involved [waves hand in front of your face]


It's worth noting that messages exported from Facebook with its export tool are often truncated. The export seems to be more comprehensive for your more recent contacts, so any analysis will skew toward the people you were in contact with most recently.


It looks like Facebook changed the format of its data dumps within the past 8 months or so, and the new format does not truncate messages. Previous dumps were structured as one giant messages.htm file, which was difficult to parse and seemed to be missing data in certain cases.


I haven't seen truncation, but the way it breaks up conversations is misleading. Instead of getting full threads, you get chunks of conversations in chronological order, which makes it a nightmare to follow anything.
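
For what it's worth, stitching the chunks back into threads is doable once the export has been parsed. Here is a rough sketch in Python. It assumes each chunk has already been parsed into a dict with `participants` and `messages` keys; those names and the `timestamp` field are hypothetical, since the real export format is undocumented and may differ.

```python
from collections import defaultdict


def regroup_threads(chunks):
    """Merge conversation chunks that share the same participants.

    Each chunk is ASSUMED (hypothetical layout) to look like:
        {"participants": ["alice", "bob"],
         "messages": [{"timestamp": 1523400000, "text": "hi"}]}
    """
    threads = defaultdict(list)
    for chunk in chunks:
        # Participant order should not matter, so key on a frozenset.
        key = frozenset(chunk["participants"])
        threads[key].extend(chunk["messages"])
    # Sort every merged thread chronologically.
    return {
        key: sorted(messages, key=lambda m: m["timestamp"])
        for key, messages in threads.items()
    }
```

Grouping by participant set is only one possible choice; it would merge two separate group chats that happen to have the same members.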


Staying on top of these undocumented pseudo-formats is a real challenge. That's why it's a good idea to not wait to archive stuff.

When I wanted to analyze 9 years of my Google Voice history, none of the existing scripts to parse it worked anymore, so I had to write one: https://github.com/unqueued/googlevoiceparse

Google Takeout's HTML archives weren't exactly friendly when I wanted to drill down and find certain patterns.
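
As a starting point for drilling into HTML archives like these, here is a minimal sketch that uses only Python's standard-library `html.parser`. It is not how googlevoiceparse works; the `TextExtractor` class and `extract_text` helper are hypothetical names, and the sketch makes no assumption about the actual Takeout markup beyond it being HTML.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the non-empty text nodes of an HTML document, in order."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        # Called for every text node (including script/style bodies).
        text = data.strip()
        if text:
            self.parts.append(text)


def extract_text(html):
    """Return a list of the stripped text nodes found in `html`."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.parts
```

Once the text nodes are in a plain list, searching for patterns is a matter of ordinary string or regex matching, which is much friendlier than reading the raw markup.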



