
ivanpashenko · 2024-06-13T08:29:50 1718267390

Sounds pretty advanced! Can you share some examples of deterministic dimensions for scoring?

And what about LLM scoring: does the LLM just output passed/not_passed, or something more?

BWStearns · 2024-06-15T21:19:55 1718486395

So for context, our app (https://nativi.sh) is a language correction app. It takes in text and cleans it up to make it sound more fluent/correct; it's basically geared towards being Grammarly for your second language.

For some of our deterministic LLM tests, we have inputs that have known spelling errors but no wrong word errors, or some other combination of errors. If the config under test doesn't identify the issue, or identifies issues that we know aren't there then it's marked as being wrong for that test case. Then we can test across config x language x kind_of_error.
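To make that concrete, here is a minimal sketch of a deterministic check, not the actual framework. Every name in it (`Case`, `ERROR_KINDS`, `run_matrix`, the stub configs) is hypothetical, and each config is assumed to be a callable that takes text and returns the set of error kinds it detected.

```python
from dataclasses import dataclass

# Hypothetical error categories; the real set is not given in this thread.
ERROR_KINDS = ("spelling", "wrong_word")


@dataclass(frozen=True)
class Case:
    language: str
    text: str
    expected: frozenset  # error kinds known to be present in `text`


def run_matrix(configs, cases):
    """Return {(config, language, kind): passed} for every combination.

    A cell fails if the config misses a known error or reports one
    that is known to be absent.
    """
    results = {}
    for name, detect in configs.items():
        for case in cases:
            found = set(detect(case.text))
            for kind in ERROR_KINDS:
                ok = (kind in found) == (kind in case.expected)
                key = (name, case.language, kind)
                results[key] = results.get(key, True) and ok
    return results


# Stub configs standing in for real LLM-backed ones (assumption).
def strict(text):
    return {"spelling"} if "recieve" in text else set()


def noisy(text):
    return {"spelling", "wrong_word"}


cases = [Case("en", "I recieve mail daily.", frozenset({"spelling"}))]
results = run_matrix({"strict": strict, "noisy": noisy}, cases)
```

Here `noisy` fails the `wrong_word` cell because it reports an error the test case is known not to contain, while `strict` passes both cells.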

For the LLM vibe-driven scoring, we have it set up to just do a head-to-head between the current leading config (usually what's in prod) and the new candidate config, rather than generating an abstract score. It will flag "x config straight up failed question N based on some_reason(s)" so that we can manually check it.
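A rough sketch of that head-to-head loop, under stated assumptions: `head_to_head`, the verdict dictionary shape, and `stub_judge` are all made up for illustration. In practice the judge would be an LLM call; a deterministic stub stands in for it here so the example runs on its own.

```python
def head_to_head(questions, leader, candidate, judge):
    """Compare two configs question by question.

    `judge(question, leader_out, candidate_out)` is assumed to return
    {"winner": "leader" | "candidate" | "tie",
     "failed": [names of configs that failed outright],
     "reasons": [strings]}.
    """
    tally = {"leader": 0, "candidate": 0, "tie": 0}
    flags = []  # (config, question number, reasons) for manual review
    for n, question in enumerate(questions, start=1):
        verdict = judge(question, leader(question), candidate(question))
        tally[verdict["winner"]] += 1
        for who in verdict.get("failed", []):
            flags.append((who, n, verdict.get("reasons", [])))
    return tally, flags


def stub_judge(question, leader_out, candidate_out):
    # Stand-in for an LLM judge: an output "fails" if it still contains "teh".
    outputs = (("leader", leader_out), ("candidate", candidate_out))
    failed = [name for name, out in outputs if "teh" in out]
    if len(failed) == 1:
        winner = "candidate" if failed[0] == "leader" else "leader"
    else:
        winner = "tie"
    reasons = ["left 'teh' uncorrected"] if failed else []
    return {"winner": winner, "failed": failed, "reasons": reasons}


questions = ["teh cat sat", "a dog ran"]
tally, flags = head_to_head(
    questions,
    leader=lambda text: text,                            # does nothing
    candidate=lambda text: text.replace("teh", "the"),   # fixes one typo
    judge=stub_judge,
)
```

The tally gives the relative result between the two configs, and `flags` is the list of outright failures to check by hand.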

My partner wrote the testing framework. She's been thinking about cleaning it up and open sourcing it.

ivanpashenko · 2024-06-12T12:41:07 1718196067

"7 likes / no comments" --> should I read it as: people are interested in other people's experience, but have nothing to share about their own? - No prompts in production? - No testing or other routines around them yet?

Please share your current status :)

ivanpashenko · on Jan 6, 2017

Just launched http://ineedicons.com: custom-made outline icons. Will see soon if it has legs.

ivanpashenko · on May 11, 2016

And how is it going? Do people use it?

sheraz · on May 11, 2016

It is going well! (At least, I'm quite happy).

Between HN and Product Hunt on the same day we took in about 8,000 unique visitors, which resulted in over 350 signups.

Big drop-off since that early traffic spike, but it looks like 40% of my traffic is still returning users. So that is interesting...

That traffic spike exposed a few big bugs which we closed this week, and now I'm figuring out next steps (marketing automation, more user acquisition, increasing sharing/virality).

Also, the more users I talk to the more I understand their use cases.

All in all, I'm loving it despite juggling this and my day job :-)

ivanpashenko · on May 11, 2016

How do you send it, via email?

ivanpashenko · on May 11, 2016

Which YCF debacle do you mean?

cpersona · on May 11, 2016

Sorry, not YCF, but Apply HN.

https://news.ycombinator.com/item?id=11647165

ivanpashenko · on May 12, 2016

Got it, thx!

ivanpashenko · on May 11, 2016

Nice. How does it work?

sharemywin · on May 11, 2016

kinda like my own personal HN(lol)...I post links with hash tags in the title. I added a login but you can post anonymously. I just haven't worked on it in ages.

ivanpashenko · on Feb 25, 2015

Everyone who writes to your public email has to pay. So basically new contacts. For friends and people you know, you use your private email.

This is what a solution could look like: http://wrte.io

Jipha · on Feb 25, 2015

I could see it being used. Lots of people get hundreds of emails a day being pitched to or asking for help. Charging people (even a small amount) might get people to put more effort into their emails.

ivanpashenko · on Feb 25, 2015

My bad. Is the question not clear enough?

ivanpashenko · on Feb 25, 2015

instead of not charging (like it is now).