My reading of that is that they compared apples and oranges. They didn't use NLP there at all (stemming, part-of-speech tagging, etc.); they just relaxed the handling of whitespace. The Nevod 'equivalent' was a less general expression, which makes it seem more maintainable.

The Nevod example translates to the following (Ruby):

    # Written with plain spaces and line breaks for readability; each run of
    # literal whitespace is then replaced with \s+ before compiling.
    pattern = Regexp.new( "(?<Name>ejection fraction|LVEF)( by visual inspection)?
                           (?<Qualifier>(is|of)( (at least|about|greater than|less than|equal to))?)
                           (?<Value>[0-9]+(-[0-9]+)?|normal|moderate|severe)".gsub(/\s+/, "\\s+"),
                         Regexp::IGNORECASE)

    # Try the pattern against sample phrasings, including irregular spacing.
    ["ejection fraction is at least 70-75",
     "ejection  fraction of about 20",
     "ejection fraction  of 60",
     "ejection  fraction of greater than 65",
     "ejection fraction of 55",
     "ejection fraction by visual  inspection is 65",
     "LVEF is normal"].each do |line|
      puts line
      puts pattern.match(line).inspect
    end

The only trick I used was to replace each run of literal whitespace in the regex source with a whitespace pattern (\s+), so that the regex stays readable as typed.
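
To show that substitution in isolation, here is a minimal sketch. The helper name `relaxed_regex` and the shortened pattern are made up for illustration; only the `gsub(/\s+/, "\\s+")` step comes from the code above.

```ruby
# Hypothetical helper: compile a readable, space-separated pattern into a
# regex that accepts any run of whitespace wherever whitespace was typed.
def relaxed_regex(source, options = Regexp::IGNORECASE)
  Regexp.new(source.strip.gsub(/\s+/, "\\s+"), options)
end

# Shortened pattern, for illustration only.
pattern = relaxed_regex("(?<Name>ejection fraction|LVEF) (?<Verb>is|of) (?<Value>[0-9]+)")

# Irregular spacing (several spaces, a tab) still matches.
match = pattern.match("Ejection   fraction\tof 55")
puts match[:Name].inspect
puts match[:Value].inspect
```

The named groups keep the extracted parts addressable by name, so the calling code does not depend on group positions.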



If you're looking for a tool that lets you incorporate legitimate NLP approaches, have a look at `odin`. This paper shows its usage in the medical domain: https://doi.org/10.1093/database/bay098

And the code is open-sourced as part of the `processors` library out of the CLULab at the University of Arizona: https://github.com/clulab/processors

The most detailed (though not completely up-to-date) documentation is probably in the manual here: https://arxiv.org/abs/1509.07513

I'm using it at my current job to build an analysis tool for customer-agent phone calls.

It allows you to build rules that match on different levels of abstraction: tokens, POS tags, and dependency paths. You can even match tokens based on word similarity (measured as the cosine similarity of word vectors).
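
To make the idea concrete, here is a rough Ruby sketch of matching on more than one level at once. This is not Odin's rule syntax or API; the `Token` struct, the helper names, and the toy word vectors are all invented for illustration, and dependency paths are left out.

```ruby
# A token carries both its surface form and its part-of-speech tag.
Token = Struct.new(:word, :tag)

# Cosine similarity between two equal-length numeric vectors.
def cosine_similarity(a, b)
  dot = a.zip(b).sum { |x, y| x * y }
  norm = Math.sqrt(a.sum { |x| x * x }) * Math.sqrt(b.sum { |x| x * x })
  norm.zero? ? 0.0 : dot / norm
end

# Toy 3-dimensional vectors, made up for this example (not real embeddings).
VECTORS = {
  "refund"        => [0.9, 0.1, 0.0],
  "reimbursement" => [0.8, 0.2, 0.1],
  "weather"       => [0.0, 0.1, 0.9]
}

# Constraint on the POS-tag level.
def has_tag(tag)
  ->(tok) { tok.tag == tag }
end

# Constraint on the word-similarity level.
def similar_to(word, threshold)
  target = VECTORS.fetch(word)
  lambda do |tok|
    vec = VECTORS[tok.word]
    !vec.nil? && cosine_similarity(vec, target) >= threshold
  end
end

# A rule is a sequence of constraints; slide it across the token list.
def find_matches(tokens, rule)
  tokens.each_cons(rule.length).select do |window|
    window.zip(rule).all? { |tok, constraint| constraint.call(tok) }
  end
end

tokens = [Token.new("I", "PRP"), Token.new("want", "VBP"),
          Token.new("a", "DT"), Token.new("reimbursement", "NN")]

# A determiner followed by any word close to "refund" in vector space.
rule = [has_tag("DT"), similar_to("refund", 0.9)]
puts find_matches(tokens, rule).map { |w| w.map(&:word) }.inspect
```

The point of the sketch is only that one rule can mix constraints from different levels; a real system would take the tags and vectors from an NLP pipeline rather than a hand-written table.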

And these rules can "cascade" (i.e. build off of each other). So you can find an entity or event in rule 1 and then look for how that interacts with another matched entity or event in a later rule.
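
Here is a minimal sketch of that cascade, again in plain Ruby rather than Odin's actual rule language. The `Mention` struct, the labels, and the trigger words are assumptions chosen to fit a customer-agent call transcript.

```ruby
# A labeled span of text found by some rule.
Mention = Struct.new(:label, :text, :start, :stop)

# Rule 1: surface patterns that label entities (hypothetical labels).
ENTITY_RULES = {
  "Agent"   => /\bagent\b/i,
  "Product" => /\b(?:router|modem)\b/i
}

# Trigger words for the later event rule (also hypothetical).
EVENT_TRIGGER = /\b(?:replaced|swapped)\b/i

def find_entities(text)
  mentions = []
  ENTITY_RULES.each do |label, regex|
    pos = 0
    # Scan forward from the end of the previous match.
    while (m = regex.match(text, pos))
      mentions << Mention.new(label, m[0], m.begin(0), m.end(0))
      pos = m.end(0)
    end
  end
  mentions
end

# Rule 2: builds on rule 1's output. An Agent mention, then a trigger word,
# then a Product mention becomes a "Replacement" event.
def find_events(text, mentions)
  agents = mentions.select { |m| m.label == "Agent" }
  items  = mentions.select { |m| m.label == "Product" }
  agents.product(items).map do |agent, item|
    next if agent.stop > item.start
    next unless text[agent.stop...item.start].match?(EVENT_TRIGGER)
    Mention.new("Replacement", text[agent.start...item.stop], agent.start, item.stop)
  end.compact
end

text = "The agent replaced the modem yesterday."
entities = find_entities(text)
events = find_events(text, entities)
puts events.map(&:text).inspect
```

The second rule never looks at raw words for the entities themselves; it only consumes the mentions produced earlier, which is the sense in which the rules build off of each other.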



