Hacker News | flaviotsf's comments

I recommend practicing disaster recovery for your personal data as well, such as Gmail. Recently I was creating filters to delete bulk messages, and when one filter got created it somehow missed the from:@xyz.com domain part, so I ended up deleting (and then deleting forever) all emails. I noticed the issue right away, but it was enough to wipe 2-3 months' worth of emails (all of them, even Sent ones).


I simply had to download the Mac / Windows (https://teams.microsoft.com/downloads) client and things worked. I never found a "web version" of Teams - but it probably exists somewhere. :)


The web version is located at https://teams.microsoft.com/, and you just log in there.


I tried Neo4j a while back for recommendations and calculating similarities between users, but when running against our full dataset I got too many OutOfMemory exceptions. Ended up with a Mahout / Spark solution. It's an awesome graph db though - I can find many other uses for it.


Yeah, I'm surprised the Neo4j team hasn't made more of an effort on this. I've run into lots of memory issues with it as well, and although there are reliable, fairly straightforward solutions to most of these problems, the team doesn't seem to be particularly interested in making sure that the defaults are robust enough to handle a reasonable workload. When your database fails on you for making a reasonable query request on a light workload, you can't help but feel troubled. There's a lot to love about Neo4j, but they've got a lot of work to do if they want to win over the developer community as a whole. There may be enterprises that get reassured by a huge price tag and a whole bunch of salespeople at their beck and call, but I don't know any of them. Every engineer I know who is willing to pay for software is either expecting a completely new kind of product or expecting to have an awesome experience with a free version of the tool before being willing to commit even a few bucks a month.


Yeah, I've tried a couple of times to get Neo4j into our stacks, but the outcome has always been the same: it's pretty much limited to baking relationship data, ahead of time or on demand, that is saved elsewhere and then cleared out. Otherwise you get into prohibitively expensive licensing / infrastructure territory very quickly.

At that point a more pragmatic solution has always won.


Exactly the same as you: I was just trying out Neo4j today with a small dataset (30 MB) and was getting memory exceptions when trying to add a relationship.


Would you mind sharing the query? If you're hitting OOM exceptions with a dataset of that size there may be a typo in the query that's doing some sort of traveling salesman operation.

e.g.,

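// Person, KNOWS, and Friend are bare variable names here, not a label or relationship type, so this matches EVERY pair of connected nodes in your database

MATCH (Person)-[KNOWS]-(Friend)

// only the people who have a KNOWS relationship between them

MATCH (person:Person)-[:KNOWS]-(friend:Person)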
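
To make the difference concrete without a running database, here is a rough Python sketch of what the two patterns ask for. The toy graph, the node names, and the `match` helper are all made up for illustration; this is not how Neo4j evaluates Cypher, just a model of "no constraints" versus "label and type constraints".

```python
# Toy in-memory graph: node id -> set of labels, plus typed relationships.
# All names here are hypothetical sample data.
nodes = {
    "alice": {"Person"},
    "bob": {"Person"},
    "acme": {"Company"},
}
relationships = [
    ("alice", "KNOWS", "bob"),
    ("alice", "WORKS_AT", "acme"),
    ("bob", "WORKS_AT", "acme"),
]


def match(left_label=None, rel_type=None, right_label=None):
    """Return undirected matches, filtering only on the constraints given."""
    results = []
    for start, kind, end in relationships:
        if rel_type is not None and kind != rel_type:
            continue
        # An undirected pattern matches each relationship in both orientations.
        for a, b in ((start, end), (end, start)):
            if left_label is not None and left_label not in nodes[a]:
                continue
            if right_label is not None and right_label not in nodes[b]:
                continue
            results.append((a, kind, b))
    return results


# Like the first query: bare names are only variables, so nothing is filtered.
unfiltered = match()
# Like the second query: the label and relationship type constraints apply.
filtered = match("Person", "KNOWS", "Person")
```

On this toy graph the unconstrained pattern returns every relationship in both orientations, while the constrained one returns only the KNOWS pair.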


The solution we are moving to is to use Spark to compute similarities, etc., and load the results into a Neo4j graph.

So we use Neo4j for the OLTP part and Spark for the OLAP part.
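
A minimal sketch of that split, with plain Python standing in for the Spark job. The sample data, the Jaccard measure, the threshold, and the `User` / `SIMILAR` schema in the Cypher string are all assumptions for illustration, not what we actually run.

```python
from itertools import combinations

# Hypothetical sample data: user id -> set of item ids the user interacted with.
interactions = {
    "u1": {"a", "b", "c"},
    "u2": {"b", "c", "d"},
    "u3": {"x"},
}


def jaccard(left, right):
    """Jaccard similarity: size of the intersection over size of the union."""
    union = left | right
    return len(left & right) / len(union) if union else 0.0


def similarity_rows(data, threshold=0.1):
    """Batch (OLAP) step: score every user pair, keep those above threshold."""
    rows = []
    for a, b in combinations(sorted(data), 2):
        score = jaccard(data[a], data[b])
        if score >= threshold:
            rows.append({"a": a, "b": b, "score": score})
    return rows


# Cypher for the load (OLTP) step. The User label, id property, and SIMILAR
# relationship type are assumed names, not a schema taken from the thread.
LOAD_QUERY = (
    "UNWIND $rows AS row "
    "MERGE (a:User {id: row.a}) "
    "MERGE (b:User {id: row.b}) "
    "MERGE (a)-[s:SIMILAR]-(b) "
    "SET s.score = row.score"
)

rows = similarity_rows(interactions)
```

The `rows` list would be passed as the `$rows` parameter of `LOAD_QUERY` by whatever driver you use, so the graph only ever stores precomputed scores.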


Can you though? My impression is that it doesn't scale to large datasets. The use cases for true graph databases (over shaky implementations on HBase/Cassandra) are sparse, in my opinion.


Six of one, half a dozen of the other.

It's a single-image database (no partitioning except in memory), so every server in the cluster holds the complete dataset (thus each server must be large enough to store it). However, because Neo4j doesn't rely on joins / table scans to operate, traversing a relationship is O(1), not O(n). So there's an advantage to doing OLTP work on really, really large datasets when the query has a specific starting point. Neo4j will do pointer arithmetic instead of scans / joins, such that regardless of dataset size a query will only access a fixed amount of data. The reason for this strategy is that scale-up hardware pricing has come down incredibly quickly in the last decade, and having a trio of 64+ GB memory boxes isn't out of the question for most mid-size and enterprise companies. Secondly, distributed systems are non-trivial to manage, from both a development and a devops perspective.
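
As a rough illustration of the scan-versus-pointer point, here is a small Python sketch. It compares a deliberately unindexed edge table against an adjacency structure; the data and function names are invented for the example, and a real relational database with an index would do much better than the scan shown here.

```python
# Relationship "table" with one row per edge, as a naive store might hold it.
edge_table = [(i, i + 1) for i in range(10_000)]


def neighbours_by_scan(node):
    """Scan-style lookup: inspects every row, so cost grows with table size."""
    inspected = 0
    found = []
    for src, dst in edge_table:
        inspected += 1
        if src == node:
            found.append(dst)
    return found, inspected


# Adjacency structure: each node holds direct references to its neighbours.
adjacency = {}
for src, dst in edge_table:
    adjacency.setdefault(src, []).append(dst)


def neighbours_by_pointer(node):
    """Adjacency lookup: touches only the starting node's own neighbour list."""
    found = adjacency.get(node, [])
    return found, len(found)


scan_result, scan_cost = neighbours_by_scan(42)
pointer_result, pointer_cost = neighbours_by_pointer(42)
```

Both lookups return the same neighbour, but the scan inspects all 10,000 rows while the adjacency lookup touches one entry, and that gap widens as the dataset grows.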

The philosophy of the Neo4j team is to conquer the world slowly. In order of priority Neo4j is designed around:

1.) data integrity and availability (ACID transactions, master-slave replication)

2.) rapid reads for graph traversals

3.) ability to store web-scale datasets (trillions++ of nodes)

4.) parallel operations (multi-master, map-reduce, global analytics, etc.)

The product has firmly completed 1 and 2, and is starting to work on 3 and 4 (4 mostly with a Databricks / Spark partnership).

It fights the same CAP problem that all databases do. We've chosen Consistency and Availability. Partition tolerance just isn't something inherent to graph databases. We can do some really smart math and duplicate nodes with high betweenness centrality (data nodes, not servers) or shuffle data based on access patterns to prevent introducing network latency into query plans that access nodes on multiple partitions. But doing that while maintaining 1 and 2 of the above is very not easy.
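
One way to see why partitioning is hard for graphs: every relationship whose endpoints live on different partitions is a potential network hop during a traversal. The sketch below just counts those cut relationships for two made-up placements; the graph, the placements, and the helper name are assumptions for illustration, not anything Neo4j ships.

```python
def cross_partition_edges(edges, partition_of):
    """Count edges whose two endpoints are stored on different partitions."""
    return sum(1 for a, b in edges if partition_of[a] != partition_of[b])


# Hypothetical data nodes and relationships.
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a"), ("a", "c")]

# Placement that ignores graph structure (e.g. alternating assignment).
by_hash = {"a": 0, "b": 1, "c": 0, "d": 1}
# Placement that keeps the tightly connected nodes a, b, c together.
by_locality = {"a": 0, "b": 0, "c": 0, "d": 1}

hash_cut = cross_partition_edges(edges, by_hash)
locality_cut = cross_partition_edges(edges, by_locality)
```

Here the structure-blind placement cuts 4 of the 5 relationships and the locality-aware one cuts 2, which is the kind of saving that shuffling data by access pattern is after.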

Disclaimer:

MATCH (rhino)-[:WORKS_AT]->(neo4j)

WHERE NOT rhino.opinions = neo4j.opinions


I wanted to build something like this: basically a P2P messaging technology with end-to-end encryption (so messages can't be inspected by peers), for use cases such as messaging (à la SMS) in a non-networked environment such as cruise ships, national parks, etc. [maybe through Bluetooth or something]. Cruise lines could easily build a messaging app that works on a closed network (lots of branding potential... not sure why it hasn't been done yet!). If anyone is interested, ping me. :)


Do you mean something like BitTorrent Bleep?

Haven't used it, but same premise.


Yes, exactly. Would be great to brand Bleep for certain audiences, but the functionality is exactly what I had in mind. Thanks for the link!


See also Briar (https://briarproject.org/), which has the added benefit of being open source.


Hi John, I also would love to test it out! https://keybase.io/flaviotsf

Thank you so much and congratulations on the release!


At work we use a combination of git + Jenkins and a simple xcopy when the build succeeds. It does a great job, plus it can do Slack events, email notifications, etc.
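
For anyone not on Windows, the "copy on success" step can be sketched in a few lines of Python. This is only an illustration of the idea: the function name, the build command, and the directory arguments are hypothetical, and it stands in for the xcopy step rather than reproducing our Jenkins setup.

```python
import shutil
import subprocess


def deploy_if_build_succeeds(build_cmd, artifact_dir, deploy_dir):
    """Run the build; copy artifacts only when the build exits with status 0."""
    result = subprocess.run(build_cmd)
    if result.returncode != 0:
        # A failed build leaves the deployed copy untouched.
        return False
    # dirs_exist_ok (Python 3.8+) lets a new deployment overwrite the old copy.
    shutil.copytree(artifact_dir, deploy_dir, dirs_exist_ok=True)
    return True
```

A CI job would call this with its real build command and paths; notifications (Slack, email) would hang off the returned boolean.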

