buboard on Nov 9, 2019 | on: Deep learning has a size problem
Google seemed to make a genuine effort with BERT to build a model that is useful rather than record-breaking. But I think it's wrong to consider it the "final" model upon which everything else will be built.
bitL on Nov 10, 2019
BERT is already outdated, but still useful, as you need only one Titan RTX to retrain the BERT_large model via transfer learning.
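For context, a minimal sketch of that kind of fine-tuning with the Hugging Face transformers library; the checkpoint name, toy data, and hyperparameters are illustrative, not from the thread:

    # Hedged sketch: fine-tune pretrained BERT-large on a small labeled
    # dataset (transfer learning). Model name and data are illustrative.
    import torch
    from torch.optim import AdamW
    from transformers import BertTokenizer, BertForSequenceClassification

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    tokenizer = BertTokenizer.from_pretrained("bert-large-uncased")
    model = BertForSequenceClassification.from_pretrained(
        "bert-large-uncased", num_labels=2
    ).to(device)

    # Toy training data; in practice this is your task-specific corpus.
    texts = ["great movie", "terrible movie"]
    labels = torch.tensor([1, 0]).to(device)
    batch = tokenizer(
        texts, padding=True, truncation=True, return_tensors="pt"
    ).to(device)

    optimizer = AdamW(model.parameters(), lr=2e-5)
    model.train()
    for _ in range(3):  # a few passes over the toy batch
        optimizer.zero_grad()
        out = model(**batch, labels=labels)  # loss computed from labels
        out.loss.backward()
        optimizer.step()

Only the classification head is new here; the pretrained encoder weights are merely nudged, which is why a single GPU suffices.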
turnersr on Nov 10, 2019
What methods make BERT outdated? Do you have pointers to other options?
bitL on Nov 10, 2019
E.g., XLNet: https://arxiv.org/abs/1906.08237
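For context, XLNet slots into roughly the same transformers-based recipe as the BERT sketch above; the checkpoint name is illustrative:

    # Hedged sketch: XLNet as a near drop-in alternative to BERT for the
    # same fine-tuning loop. Checkpoint name is illustrative.
    from transformers import XLNetTokenizer, XLNetForSequenceClassification

    tokenizer = XLNetTokenizer.from_pretrained("xlnet-large-cased")
    model = XLNetForSequenceClassification.from_pretrained(
        "xlnet-large-cased", num_labels=2
    )
    # From here, the training loop is the same as in the BERT sketch.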
phreeza on Nov 10, 2019
XLNet is BERT with a bunch of additional training tricks.
bitL on Nov 10, 2019
BERT is a Transformer with a bunch of additional training tricks. Transformer is self-attention with a bunch of additional training tricks...
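For reference, the self-attention primitive at the bottom of that stack is small enough to sketch directly; a minimal single-head version in plain PyTorch, with illustrative dimensions:

    # Minimal single-head scaled dot-product self-attention, the primitive
    # underlying the Transformer (and hence BERT). Dimensions illustrative.
    import math
    import torch

    def self_attention(x, w_q, w_k, w_v):
        """x: (seq_len, d_model); w_*: (d_model, d_k) projections."""
        q, k, v = x @ w_q, x @ w_k, x @ w_v       # queries, keys, values
        scores = q @ k.T / math.sqrt(k.shape[-1])  # scaled dot products
        weights = torch.softmax(scores, dim=-1)    # per-token attention
        return weights @ v                          # weighted sum of values

    seq_len, d_model, d_k = 4, 8, 8
    x = torch.randn(seq_len, d_model)
    w_q, w_k, w_v = (torch.randn(d_model, d_k) for _ in range(3))
    out = self_attention(x, w_q, w_k, w_v)         # shape: (seq_len, d_k)

Everything above this in the stack (multiple heads, positional encodings, masked-LM pretraining, permutation-LM objectives) is, as the comment puts it, additional training tricks layered on this core.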