Hacker News new | past | comments | ask | show | jobs | submit | barbolo's comments login


I’ve been using DINOv2 for some months now. I’ve tried the models that add 4 register tokens alongside the CLS + patch tokens. I have several embeddings (tokens) from the previous model (no registers) that are part of my solution, so I didn’t adopt the newer “register” models: the CLS tokens are not aligned between the 0-register and 4-register models. It would be nice if the CLS and patch tokens were somehow aligned between those models.
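A quick way to confirm the misalignment is to compare CLS embeddings of the same image from the two checkpoints directly. A minimal sketch with stand-in vectors (in practice these would be the actual DINOv2 outputs, not random data):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Stand-ins for the CLS token of the same image from the two checkpoints;
# real embeddings would come from the 0-register and 4-register models.
rng = np.random.default_rng(0)
cls_no_registers = rng.standard_normal(768)
cls_with_registers = rng.standard_normal(768)

# Unaligned embedding spaces give near-zero similarity, so embeddings
# stored from the old model can't be mixed with the new model's.
print(cosine_similarity(cls_no_registers, cls_with_registers))
```

If the spaces were aligned, the same image would score close to 1.0 across checkpoints; in practice the two models' CLS spaces behave like unrelated bases.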


It's much worse than that. The leak contains dozens of datasets: relatives, addresses, jobs (including LinkedIn profiles), schools, vehicles, income, debts, pictures of faces, companies.


This leak is much more harmful. If the data really comes from Serasa Experian, they have more accurate and structured data on people/companies/assets in Brazil than anyone else.


Would that be a viable option for deploying TensorFlow models in serverless environments (Lambda, Cloud Functions)?


You can deploy TensorFlow model binaries as serverless APIs on Google Cloud ML Engine [1]. But I would also be interested in seeing a TensorFlow Lite implementation.

[1] https://cloud.google.com/ml-engine/docs/deploying-models

Disclaimer: I work for Google Cloud.


Thanks, @rasmi. I have some feedback for you guys. The pricing for prediction inference on GCP is not very fair: if I deploy a small model (like SqueezeNet or MobileNet), I pay almost the same price as someone deploying a large model (like ResNet or VGG). That’s why I’m deploying my models in serverless environments and paying about 5 dollars per 1 million inferences.

GCP’s pricing is $0.10 per thousand predictions, plus $0.40 per node hour. That’s more than 100 dollars for 1 million inferences.
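The arithmetic behind that comparison, as a small sketch. The per-prediction figure follows directly from the quoted rates; the node-hour total assumes (my assumption, not stated in the thread) an always-on endpoint kept deployed for a full month:

```python
PREDICTIONS = 1_000_000

# ML Engine: $0.10 per 1,000 predictions, plus $0.40 per node hour.
ml_engine_per_prediction = 0.10 / 1_000 * PREDICTIONS   # $100 for 1M predictions
ml_engine_node_hours = 0.40 * 24 * 30                   # assuming one month always-on
ml_engine_total = ml_engine_per_prediction + ml_engine_node_hours

# Serverless (e.g. AWS Lambda): roughly $5 per million inferences
# for a small model, per the figure in the comment above.
serverless_total = 5.0

print(f"ML Engine:  ${ml_engine_total:.2f}")
print(f"Serverless: ${serverless_total:.2f}")
```

Even ignoring the node hours entirely, the per-prediction component alone is 20x the serverless cost for a small model.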


I see what you mean. To some companies, ML Engine's cost as a managed service may be worth it. To others, spinning up a VM with TensorFlow Serving on it is worth the cost savings. If you've taken other approaches to serving TensorFlow models to get around ML Engine's per-prediction cost, I'm curious to hear about them.


The main TensorFlow runtime provides a lot of functionality for larger machines like servers (e.g. desktop GPU support and distributed support). That said, TensorFlow Lite does run on standard PCs and servers, so using it on non-mobile/non-small devices is possible. If you wanted to create a very small microservice, TensorFlow Lite would likely work, and we’d love to hear about your experiences if you try this.


Thanks for the answer. Currently I’m using AWS Lambda to deploy my TensorFlow models, but it’s pretty hard and hacky: I need to strip out a considerable portion of the code base that isn’t needed for inference-only routines, both so the code loads faster and to fit within the deployment package size limit. If TensorFlow Lite is already a compact build, it may be much easier to deploy in a serverless environment. I’ll try it in my next deployments.


Sounds really interesting. We're excited to hear how that goes.


I’ve been reading it over and over since last weekend. And I’ve been checking the code. And I still don’t understand it.


Blocking audio means blocking visually impaired people from accessing websites that use reCAPTCHA.

Hacking reCAPTCHA is not only for bad actors. There are several use cases where solving reCAPTCHA automatically is legitimately needed.


What are these legitimate use cases for automatic reCAPTCHA solving?


Automating searches on a government website that decided to use reCAPTCHA just to look modern. There are dozens of such sites in Brazil, for example.


Nice work. It also includes a file, mfcc.py, which uses the Mel spectrogram to solve the audio challenge offline. With enough data, a model based on MFCCs should work much better than any general-purpose cloud speech recognizer.
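For context, the MFCC pipeline that a file like mfcc.py typically implements is: frame the signal, take the power spectrum, apply a Mel filterbank, log, then a DCT to decorrelate. A self-contained numpy sketch of that standard pipeline (the parameter values here are common defaults, not taken from the repo):

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mfcc(signal, sr=16000, n_fft=512, hop=160, n_mels=26, n_ceps=13):
    # Frame the signal and apply a Hann window.
    window = np.hanning(n_fft)
    frames = np.array([signal[s:s + n_fft] * window
                       for s in range(0, len(signal) - n_fft + 1, hop)])
    # Power spectrum of each frame.
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2
    # Triangular Mel filterbank, equally spaced on the Mel scale.
    mel_points = np.linspace(hz_to_mel(0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sr).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        fbank[i - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fbank[i - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)
    log_mel = np.log(power @ fbank.T + 1e-10)
    # DCT-II to decorrelate; keep the first n_ceps coefficients.
    n = np.arange(n_mels)
    dct = np.cos(np.pi * np.outer(np.arange(n_ceps), n + 0.5) / n_mels)
    return log_mel @ dct.T

# One second of a 440 Hz tone -> one MFCC vector per 10 ms hop.
audio = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)
features = mfcc(audio)
print(features.shape)  # (97, 13)
```

These MFCC frames are what you would feed a small recognizer trained on reCAPTCHA audio, instead of sending the clip to a cloud speech API.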


Another interesting fact is that TensorFlow 1.4 supports native MFCC spectrogram tensors.


Are you sure SES was the problem? Did you correctly configure DKIM/SPF?
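For SES specifically, the usual setup is a TXT record for SPF (on a custom MAIL FROM domain) and CNAME records for Easy DKIM. A hypothetical zone-file fragment for illustration (the domain and DKIM token are made up; SES generates the real tokens in its console):

```
; SPF on a custom MAIL FROM domain (example.com is a placeholder):
mail.example.com.                 TXT    "v=spf1 include:amazonses.com ~all"

; Easy DKIM: CNAMEs published from tokens SES generates:
token1._domainkey.example.com.    CNAME  token1.dkim.amazonses.com.
```

If those records are missing or wrong, many receivers will reject or junk the mail regardless of which provider sends it.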


We've occasionally gotten bounces like "554 5.7.1 Service unavailable; Client host [54.240.27.56] blocked using dnsbl.sorbs.net".

Amazon says SORBS is worthless, but unfortunately someone is still using them: https://docs.aws.amazon.com/ses/latest/DeveloperGuide/blackl...


Seriously, when doing bulk emailing you find that mail services are often the dustiest, cobwebbiest things on the internet. I swear some of these systems have sat untouched for literal decades.


I've also had deliverability problems with SES, and I absolutely did set up DKIM/SPF right.


Segmentation is not needed in a modern deep learning system, since it is learned by the neural network. It’s a solved problem for many handwriting recognition tasks.


I'd love to see a system or a paper that solved this problem.

