
I’m glad to see support for GNNs in TensorFlow. Having worked with GNNs for the past few years, I find it tiring to roll my own framework.


What's an example problem for which such networks work well?


I've seen lots of good come out of GNNs in the biomedical space.

For example, in drug discovery you can represent any molecule, along with its biochemical properties, as a graph and store the entire structure that way. So you build a million graphs of the biocompounds you know, a million graphs of random/similar compounds, and see which of the new million molecules the GNN predicts will work on disease X.
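
Roughly, "a molecule as a graph" looks like this (a toy sketch using networkx with made-up attribute names; real pipelines typically use something like RDKit or the GNN library's own graph format):

    import networkx as nx

    # Hypothetical example: the heavy atoms of ethanol (C-C-O) as a graph.
    # Nodes are atoms with element features; edges are bonds with bond-order features.
    mol = nx.Graph()
    mol.add_node(0, element="C")
    mol.add_node(1, element="C")
    mol.add_node(2, element="O")
    mol.add_edge(0, 1, bond_order=1)
    mol.add_edge(1, 2, bond_order=1)

A GNN consumes many such graphs, each labeled with a measured property (e.g. activity against disease X), and learns to predict that label for unseen molecules.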

Long review here that might be closed access: https://academic.oup.com/bib/article/22/6/bbab159/6278145


They show a simple one in the post:

> In the example below, we build a model using the TF-GNN Keras API to recommend movies to a user based on what they watched and genres that they liked.

> The code above works great, but sometimes we may want to use a more powerful custom model architecture for our GNNs. For example, in our previous use case, we might want to specify that certain movies or genres hold more weight when we give our recommendation.


Couldn't you use a regular DNN with one-hot encoding of all the movies seen by a user (and the corresponding genres)? And boosting can give more weight to certain movies or genres.
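
Concretely, I mean something like this (a rough Keras sketch with made-up sizes, not the setup from the post):

    import tensorflow as tf

    NUM_MOVIES = 10_000   # hypothetical vocabulary size
    NUM_GENRES = 20

    # Multi-hot inputs: 1 for every movie the user watched / genre they liked.
    movies_in = tf.keras.Input(shape=(NUM_MOVIES,), name="watched_movies")
    genres_in = tf.keras.Input(shape=(NUM_GENRES,), name="liked_genres")

    x = tf.keras.layers.Concatenate()([movies_in, genres_in])
    x = tf.keras.layers.Dense(256, activation="relu")(x)
    x = tf.keras.layers.Dense(128, activation="relu")(x)
    out = tf.keras.layers.Dense(NUM_MOVIES, activation="softmax")(x)

    model = tf.keras.Model([movies_in, genres_in], out)
    model.compile(optimizer="adam", loss="categorical_crossentropy")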


It appears the difference lies in the depth of knowledge. A one-hot encoding, followed by an embedding and some fully connected layers, only takes the actual titles into account.

A GNN can take into account everything you know about the movies, including incomplete data. It will see that user A likes everything with actor X, user B really wants the genre to be Y, user C likes actor Z but only before 2000, and combinations of those. So the GNN can do better; hell, it can even predict which movie properties would do well, which would be tough to get out of the embedding network.

You could encode all of this data in your embedding, but the GNN will be a much smaller and much more flexible network.


Think of it as an ensemble for blending your normal NN features (ex: RNN for time/clickstreams) with a model that can also leverage useful graph features (document citations, app logins, chemicals connecting, social graphs).

We think a lot about security/fraud and digital journeys, where NN + xgboost are popular in general, and graph methods are used separately (or upstream) for looking at broader structure. GNNs help blend these models. For example, in analyzing malicious user accounts (ex: misinfo on Twitter), we already get many time/NLP/etc. scores for whatever events/entities we look at, and use the social network structure to ensure better propagation/blending, similar to why boosting and ensemble methods became popular to begin with. Feel free to DM if interested; we are quite excited by this space and working on some things here.


So, if I understand you correctly, it would be something like this. Inputs:

1. {x0, x1, ...} - nodes in the graph, say users

2. A bunch of edges like {x_i, x_j}, say social connections

3. Some raw or processed features on nodes, say the text of posts, the age of the account, or some NLP-based scores for posts

4. Some raw or processed features on the edges (in particular maybe some coloring)

Before: people would train various classifiers/regressors directly on the nodes and/or edges, then maybe use the graph structure to propagate the scores.

After: instead, you train whatever objective you have from raw features on the edges and nodes, with some extra message passing between nodes and edges; for example, you train (some of) that NLP-based classifier together with the graph part. The benefit would be that you can extract signals from the NLP part that are more useful for determining the properties of neighbors, even if they aren't as useful for determining the properties of the current node/edge itself.
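
As a bare-bones sketch of one round of that message passing (plain PyTorch, purely illustrative, not any particular library's API):

    import torch
    import torch.nn as nn

    class MessagePassingLayer(nn.Module):
        def __init__(self, node_dim, edge_dim):
            super().__init__()
            self.msg = nn.Linear(node_dim + edge_dim, node_dim)  # message from (source node, edge)
            self.upd = nn.Linear(2 * node_dim, node_dim)         # update for the target node

        def forward(self, x, edge_index, edge_attr):
            # x:          [num_nodes, node_dim]   e.g. NLP scores, account age
            # edge_index: [2, num_edges]          (source, target) pairs
            # edge_attr:  [num_edges, edge_dim]   e.g. the edge "coloring"
            src, dst = edge_index
            messages = torch.relu(self.msg(torch.cat([x[src], edge_attr], dim=-1)))
            agg = torch.zeros_like(x).index_add_(0, dst, messages)  # sum over incoming edges
            return torch.relu(self.upd(torch.cat([x, agg], dim=-1)))

Stacking k of these layers lets information travel k hops, and the task head (say, the NLP-based classifier) trains end to end on the resulting node states, which is the "train it together" part.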

Question: what's the maximum range of such message passing? It sounds a bit like an RNN, where the unroll depth can be an issue. Though in practice most graphs have a low average path length, so maybe this isn't a particularly big problem.

Although if you start unrolling graphs you'll very quickly load ~everything, so I guess the training must be completely reworked (flush data to distributed storage frequently then shuffle for the next step) or you cannot unroll further than maybe a few steps.


Yes, I think you're seeing it.

Before: people might precompute graph scores ("pagerank", ...) and use them as features for tabular NNs, or use simpler and slower GNNs like GraphSAGE because the domain fit was great (ex: Pinterest social recs).

After: heterogeneity and scale for graphs that fit in CPU RAM (1 TB) with decent GPUs.

Re: unrolling, yeah, there are a bunch of papers there :) Sampling, artificial jump edges, and adversarial techniques have been helping with aspects of generalization (far data, unbalanced data, ...).
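
The sampling part is basically about capping how much of the graph a minibatch pulls in; a toy sketch (hypothetical adjacency-dict format, real libraries ship their own samplers):

    import random

    def sample_k_hop(adjacency, seeds, k, fanout):
        # adjacency: dict node -> list of neighbour nodes (illustrative format)
        # Keep at most `fanout` sampled neighbours per node, `k` hops out from `seeds`.
        frontier, visited = set(seeds), set(seeds)
        for _ in range(k):
            nxt = set()
            for node in frontier:
                nbrs = adjacency.get(node, [])
                for n in random.sample(nbrs, min(fanout, len(nbrs))):
                    if n not in visited:
                        nxt.add(n)
            visited |= nxt
            frontier = nxt
        return visited  # the only nodes this minibatch needs to load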


I remember reading a bit about GNNs circa 2019. At that time it seemed to be mostly about point clouds (for LIDAR data and for 3-D modelling), but I imagine things have changed a lot on this front. Are there any interesting papers/resources you could recommend for one to get back up to speed?


From what I can tell, the field is indeed evolving very rapidly, but I have only worked on a specific application (knowledge graph completion), so I can't give an overview of all the current applications. I can, however, recommend William Hamilton's excellent textbook, which is available online [1].

[1] https://www.cs.mcgill.ca/~wlh/grl_book/files/GRL_Book.pdf


For enterprise relevance in our world, the exciting things have been handling heterogeneity via things like RGCNs, and handling bigger scales via DGL (GPU tricks, sampling tricks, ...). Imagine fraud, hacks, and entity resolution from everything you've recorded on a user interacting with a system.

There are important cases like maps and chemistry that take more specialized techniques, but we focus on events/logs/etc. So there's less to say on the niche stuff, even if those niches cover big use cases like "how Google Maps works" or "how Google auto-designs their TPUs".

For the logs/events/transactions/clicks/devices/users/accounts cases, happy to chat, but maybe not as useful elsewhere :)


Any reason for not using PyTorch? They have PyTorch Geometric.


What's the state of GNN support elsewhere? Does everyone else also roll their own, or are folks using PyTorch or something else?



The bottleneck in GNN computations is that the aggregation ops can't be expressed as matrix operations and require writing custom kernels. This problem was solved in PyTorch with torch-scatter. The other bottleneck is subsampling (e.g. k-hop), which also doesn't benefit from GPU support. Other than that, the embedding aspects can just be written as standard nn ops.
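
For context, the pattern torch-scatter accelerates is roughly this (a sketch; the exact signature assumes a recent torch-scatter release):

    import torch
    from torch_scatter import scatter  # pip install torch-scatter

    messages = torch.randn(6, 8)                 # one message per edge
    edge_dst = torch.tensor([0, 0, 2, 3, 3, 3])  # target node of each message

    # Each node receives a different number of messages, so this per-node
    # reduction isn't a single dense matmul; scatter fuses it into one kernel.
    node_out = scatter(messages, edge_dst, dim=0, reduce="sum")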


Deep Graph Library (DGL) is the big one; it can use PyTorch, MXNet, or TensorFlow as the backend and is developed by AWS. You also have PyTorch Geometric and Jraph, which is built on top of JAX and used mostly by researchers at DeepMind, as far as I can tell.


GeometricFlux for GNNs in Julia: https://github.com/FluxML/GeometricFlux.jl


DGL is the other big one; it supports several frameworks (at least PyTorch and MXNet).


These aren’t franchises where some employee can make that decision; the artist would have had to reach out to Apple’s legal team to get consent to do this at one of their stores. Yeah, that might be cumbersome and make the work impossible to do, but if you were in the employee’s shoes, would you really want the onus of all of this on your head?


> the artist would have had to reach out to Apple’s legal team to get consent to do this at one of their stores

Oh that's the smarter thing to do for sure, but you're still going to be contacting one employee (who then potentially contacts other employees). I just found the emphasis on the quantity of employees in the GP comment unusual.

> but if you were in the employee’s shoes would you really want the onus of all of this on your head?

I would assume that if there was any doubt in their mind they could pretty easily go "I'll have to get my manager". If anyone with experience working at an Apple store can chime in that would be fantastic, but I suspect that a store manager in 2011 would have been able to approve installing software on a demo computer without having to go through legal. Presumably that changed pretty quickly after this stunt though.


No store manager in their right mind would approve this though.


Suddenly not so much Star Wars as much as it is Gundam :p


Or Culture series :)


The people who remain on Earth do nothing but pollute it, because their souls are weighed down by gravity!


I got a bunch of empty accounts and one startup launch post. Not sure how that qualifies as a doppelgänger


Dunno why but I’ve gotten 7 integration test emails so far


This is great; your demo’s music just made the task seem hilariously easy.


Shark tank throwback :p


Love what I’ve seen so far. I’ve tried SuperAnnotate but ended up opting to build my own AI-assisted tools into Labelbox; excited to potentially try this out.


Looking forward to hearing your feedback when you give it a try!


This was a fun exercise; I definitely think this could be difficult to suss out for greener devs or even more experienced ones. It’d be hilarious to have this model power a live screensaver in lieu of actually being busy at times.


