ptest1's comments

TOML is extremely verbose for lists. It’s ridiculous. Sure, it has fewer features than YAML, but it also looks considerably worse in all the common use cases. It looks like INI.


Yeah, I was fighting the list problem. And not being able to mix numeric and string data in a list made the whole thing unwieldy and frustrating.

I’ve used JSON for configs, which is not great, but at least it's readable. The biggest problem is the lack of comments, but I added a quick hack to ignore any dictionary key starting with an octothorpe. Not ideal, but it is actually handy because it makes it easy to comment out keys temporarily as well as add arbitrary commentary.
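A minimal sketch of that kind of hack, assuming the filtering is done while parsing. The names `drop_commented_keys` and `load_config` and the sample config are made up for illustration; the original implementation isn't shown in the comment.

```python
import json

def drop_commented_keys(pairs):
    """Build a dict, skipping any key that starts with '#'."""
    return {key: value for key, value in pairs if not key.startswith("#")}

def load_config(text):
    # object_pairs_hook is called for every JSON object, including nested ones
    return json.loads(text, object_pairs_hook=drop_commented_keys)

# Hypothetical config: "#"-prefixed keys act as comments or disabled settings
example = """
{
    "# note": "arbitrary commentary lives under a commented key",
    "host": "localhost",
    "#port": 8080,
    "options": {"#debug": true, "retries": 3}
}
"""

config = load_config(example)
```

Here `config` ends up with only the `host` key and the nested `retries` key; putting a `#` in front of a key name is enough to switch it off.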

Also, the Python standard library JSON parser is very flexible and yields precise exceptions. I was able to turn those parsing exceptions into meaningful error messages with precise line and column numbers. That came as a bit of a shock to some of my users, because a different tool written before my time but used by the same people had exactly one error message for misspelled YAML: segfault.
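Roughly what that can look like, as a sketch rather than the actual tool: `json.JSONDecodeError` carries `msg`, `lineno` and `colno` attributes, and the wrapper below (the `parse_config` name and the message layout are assumptions) turns them into a compiler-style error.

```python
import json

def parse_config(text, source="<config>"):
    """Parse JSON, re-raising syntax errors with file:line:column context."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as err:
        lines = text.splitlines()
        # Show the offending line, if the reported line number is in range
        bad_line = lines[err.lineno - 1] if 0 < err.lineno <= len(lines) else ""
        # Caret under the reported column (colno is 1-based)
        pointer = " " * (err.colno - 1) + "^"
        raise ValueError(
            f"{source}:{err.lineno}:{err.colno}: {err.msg}\n{bad_line}\n{pointer}"
        ) from None
```

Calling `parse_config(open(path).read(), source=path)` on a file with a missing colon would then report the line and column of the problem instead of a bare traceback.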


I wholeheartedly and unironically disagree!

We have all we need. If you want a full featured language, use that. Beyond that, we have JSON, YAML, INI and more. If you want something more complicated you can create your own DSL for your app.

If you want schemas and validation, use XML!


each of those options has major downsides:

- 'dumb data' (json, yaml, ...) is machine-, but not human-friendly. it's often tedious to write/read, and you can't abstract common parts out

- DSLs are something you have to write, debug and maintain yourself. is the bug in your config or is it in your DSL's implementation? who knows!

- full featured languages require a full blown interpreter and aren't tooling-friendly. as an example: with a Python package, it's not really possible to statically determine the dependencies, because its setup.py can declare anything it pleases depending on, say, the time of day. also you can't really run untrusted configs because they might launch some missiles

---

there's a decent middle ground – write a program that generates a 'dumb data' config. you write a small amount of code (friendly for humans) and run it to get a static, easy-to-process config (friendly for machines). however in practice (in most languages) the program won't be pretty – probably about as easy to read/write as an implementation of a macro that directly manipulates ASTs (i.e. not very). this can sometimes be ameliorated with some EDSL trickery, but that brings back all the problems DSLs have
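a rough sketch of that middle ground, assuming JSON as the 'dumb data' target. the `service` helper, the field names and the port numbers are all invented for illustration:

```python
import json

def service(name, port, replicas=2):
    """Shared settings are written once here instead of being copied per service."""
    return {
        "name": name,
        "port": port,
        "replicas": replicas,
        "healthcheck": {"path": "/healthz", "interval_seconds": 10},
    }

def build_config():
    # The human-friendly part: a few calls instead of repeated blocks of data
    return {
        "services": [
            service("api", 8080),
            service("worker", 8081, replicas=4),
        ]
    }

def render(config):
    # sort_keys keeps the output stable, so diffs of the generated file stay readable
    return json.dumps(config, indent=2, sort_keys=True)

if __name__ == "__main__":
    print(render(build_config()))
```

the consumer only ever sees the rendered JSON, so it never has to run the generator or trust it.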

and so generating configs is what projects like Dhall/Cue aim to improve. i'd say they're aiming to be something like the regex of config generation - do a limited amount of common&useful things, and make them easy to express.


Stop commenting on every HN post, thanks


Fact: I've commented on fewer than 1% of HN posts today alone.

Fact: my post is very relevant to the OP and I'm offering to help them.

Fact: I've been a member of this site for over 12 years, and never comment on a post unless I think it adds value to the discussion or would be helpful to the parent.


Fact: What you wrote was essentially a private, directed comment that should have been a private email, and added literally nothing to the discussion.


> should have been a private email

That's a good suggestion. I didn't think of that. But thinking about it, I guess I don't want to bother the OP's inbox. If they are interested, they can get in touch.

> added literally nothing to the discussion.

I disagree. When I post a new language, and someone shares a link to a related language, those are often the most valuable comments.

Validating, defining, using data: those sound like things Tree Notation syntax is perfect for. Cue's semantics, presentation, and execution are great; I just think a syntax switch is potentially worth exploring. I understand the strategy of being able to parse JSON as Cue, and that's probably the way to go for now, but in the future Tree Notation syntax might offer compelling advantages.


FWIW I think your post is very relevant.

I like the idea of a git based database. Please don't take a single person's opinion as that of the entire community.


So..why not use XML?


Because while XML has types and schemas, the actual implementation and ergonomics are absolute garbage. Those things matter to intelligent people.


So that humans' eyes don't bleed when they try to read it.

One of the things they bring up in the docs is lessening boilerplate. And it's hard to get more boilerplatey than XML.


Schemas, interesting. I wonder where we’ve seen that before..


We could have a URI that gives you a schema and some tools to generate a skeleton for that schema for you. Then you could put a header in your request for what you want to do and send that payload.

Like a simple configuration access and management (SCAM) protocol.

Yes, yes I think we’re on to something here.


This takes me back to the era of random publicity “multimedia” CD-ROMs. I remember being /thrilled/ that I got a free CD-ROM from Toyota!

Before websites were really a thing, having your own multimedia CD-ROM to give out was some kind of status symbol!


In 2017, Tumblr had ~400 employees. If each employee had a modest $50k retention bonus, that’s at least an additional $20M. So I’m not sure the total cost for the purchase was $3M; maybe that was just for the corporate assets.


Another argument the author may not have considered is that by prematurely releasing his implementation, he may be making it more (not less!) likely that future discoveries are hidden away from the public.

It seems OpenAI wanted to release this project in phases, allowing people time to adjust to its nature. If in the future an even more disruptive project is created (by OpenAI or others), and the creator feels they cannot release it in what they perceive to be a “safe” way, they may simply avoid publishing and instead privately communicate with companies and powerful individuals. I don’t think that is the outcome the author wants, so I hope he reconsiders here.


> It seems OpenAI wanted to release this project in phases, allowing people time to adjust to its nature.

Do you really think anyone besides AI enthusiasts is paying attention? It's not like the general public is even aware of this, let alone following its progress.


I may have spoken too generally here. By “people” I meant e.g. engineers at Google, Facebook, Reddit, news outlets, that kind of thing. I see it a bit like a security disclosure.


I understand OpenAI is experimenting with their release of the GPT2 model, but I still don't understand their reasoning. If it's too dangerous to release today, what's going to change in the few months before they release it? They don't say why it's too dangerous beyond hand-waving, so it's impossible to protect against it.

Security disclosures are much simpler - we found a vulnerability and we will provide time for the company/team/organization to patch it before announcing it to the world so it won't be exploited by bad actors.

If OpenAI truly feels they have something akin to nuclear weaponry, and that fewer actors having it is better, then they have to openly admit that they consider themselves better gatekeepers of the technology than the public and back away even further from their non-profit/limited profit ideals. "We are creating this technology for the good of the world, but it's so good we are afraid to let you use it, so only we will benefit from it."

I find them wildly inconsistent in their messaging, trying to have the best of everything with none of the drawbacks.


I don’t get it. The title is totally incorrect. He hasn’t written a third of what’s on Wikipedia.


I assume they're dividing his "almost 3 million edits" by the "more than 5.7 million articles" on en.wp.

It's obviously extremely wrong, but their incentives aren't for being sticklers on accuracy. They're for being sensational.


Yeah, that makes sense. But it’s still wrong!


Actually, to follow up on that, there've been a few articles about him, and on Quora[0], the author of a WaPo article[1] says "Of the site's nearly 5.7 million pages, Pruitt has edited a staggering one third."

If this guy is correct, the edits are sufficiently spread across different articles for him to have touched 1/3 of them. Still obviously not the same as writing, but it puts the headline in a slightly less absurd light.

[0] https://www.quora.com/Who-is-Ser-Amantio-di-Nicolao-of-Wikip...

[1] https://www.washingtonpost.com/lifestyle/magazine/meet-the-m...


Well, to be fair, the title doesn't say he wrote it. Just that he's behind it. Which to me doesn't have any clear or even vague meaning. If someone asked, I couldn't tell them. But it is vaguely arguably true, I think. I've seen much worse.


yeah you're right, it's not even a little bit accurate.


CBS News. I don't even really notice sensationalized or misleading headlines so much these days, until they're pointed out to me.

Perhaps this guy's contribution count represents one-third of a top-contributors list, and the author conflated that with "all of Wikipedia" out of laziness? Curious that "Steven Pruitt" (or any usernames that look like they might be owned by him) doesn't even show up on any of Wikipedia's own lists:

https://en.wikipedia.org/wiki/Wikipedia:List_of_Wikipedians_...

https://en.wikipedia.org/wiki/Wikipedia:List_of_Wikipedians_...


He is on those top-contributors lists. His username is "Ser Amantio di Nicolao" per the article.


His username, Ser Amantio Di Nicolao, shows up on both of those lists.


But it isn’t a clock because it doesn’t tell us how much time has passed. PoW allows for the creation of an ordering. I’m not sure these metaphors are helpful.


I believe the reference is to a vector clock [0]

[0] https://en.wikipedia.org/wiki/Vector_clock


That makes a lot more sense, thanks.


> But it isn’t a clock because it doesn’t tell us how much time has passed.

Not necessarily: https://en.wikipedia.org/wiki/Logical_clock
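For anyone unfamiliar with the idea, here is a toy Lamport-style logical clock. It is only a sketch of the general concept behind that link (the class and method names are my own), not a description of how any blockchain implements ordering.

```python
class LamportClock:
    """A counter that orders events without measuring elapsed time."""

    def __init__(self):
        self.time = 0

    def tick(self):
        """Local event or message send: advance the counter."""
        self.time += 1
        return self.time

    def receive(self, sender_time):
        """Merge the timestamp carried on an incoming message."""
        self.time = max(self.time, sender_time) + 1
        return self.time

a, b = LamportClock(), LamportClock()
t_send = a.tick()            # a stamps an outgoing message
t_recv = b.receive(t_send)   # b's counter moves past the sender's stamp
```

The receive is always stamped later than the send that caused it, yet the numbers say nothing about how many seconds passed in between.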


It tells you the time in number of blocks. Block height is the unix timestamp of blockchain-based timestamp servers.

