Ć Programming Language (github.com/pfusik)
290 points by Lammy on Oct 9, 2021 | hide | past | favorite | 177 comments


All: I understand that when a post like this shows up, everyone's first reflex (including mine) is to react to the name. However, reflexive reactions like that tend to be shallow and to lead to boring, generic discussion. Here we're going for reflective discussion, not reflexive—which takes longer but ends up being less predictable and therefore more interesting: https://hn.algolia.com/?dateRange=all&page=0&prefix=true&sor...

The idea here is to let the quickest, shallowest reactions go, and wait for the more interesting ones to show up. With a post like this one, that would mean focusing on the details of the language. It's not that names are irrelevant, but we shouldn't focus on the surface at the expense of the depth—doing that leads to exciting-in-the-short-run, but boring-in-the-long-run results.

https://news.ycombinator.com/newsguidelines.html


I'm a little unclear on whether this just produces programs that leak memory when using the C target.

Are there any samples of what the generated code looks like in various languages?

I would also like to suggest a project name change as soon as possible.


From the author on an issue:

> cito has no own garbage collector. You get what the target language offers. If it's C#, Java, JavaScript, Python, then there is a GC. In C, C++ and Swift, there are stack variables and reference counting for dynamic allocations. In OpenCL there are only stack variables.


So circular references leak memory when targeting C and C++, yes?


Yes, you can create a memory leak when targeting C, C++ and Swift. Same as when you code in C, C++ or Swift directly.


Right, but when you write C, C++ and Swift directly you can also make a program that _doesn't_ leak memory. The question posed was whether that's even possible in Ć?


Yes: Ć has "dynamic references" (`shared_ptr`) denoted as `T#`, "read-only references" (`const T*`) denoted as `T`, "read-write references" (`T*`) denoted as `T!` and "storage" (`T`) denoted as `T()`.

So you can (and probably should) do almost any memory management as you would in C++. Except I don't see any alternative for `weak_ptr` at the moment.


> So you can (and probably should) do almost any memory management as you would in C++

Well that's already out the door because I almost never use shared_ptr in C++ code. The C++ core guidelines recommend using unique_ptr whenever possible. If you're going to use shared_ptr literally everywhere you do dynamic allocation, you'd be better off using a tracing GC to avoid the extra pointer indirection (and cache miss) with every dereference.

It seems to me that Cito should have manual memory management, because a manually memory-managed language can be trivially mapped onto a GC language (just turn every free or delete into a no-op), while the inverse problem is intractable in the general case.


Please show us real code where std::unique_ptr vs std::shared_ptr makes a considerable difference (with benchmarks). A pointer indirection causes a cache miss?

A shared_ptr can be optimized to unique_ptr if it's not copied.

Not sure what you mean by "a GC language". Most cito targets are garbage-collected.


I actually tried to wipe the accent off my screen a couple of times, haha. I dunno, this feels like it could be neat but agreed with everything in your comment.


At first I read it as: "C Programming Language" and "C programming language. Translated automatically to C", I was like... what? The main reason I actually clicked on comments is because I thought it was related to the actual C. The name is very confusing.


Just to add: I did not see the accent because the font is too small. It is barely visible on, say, Á, too.


I did not see it either

Even though I put HN at 120% font size

At least the github page heading has a large enough font size to see it


There are more languages and ways of using the Latin script than just English. Many languages use diacritics.


I'm very aware of that. But because my mind interpreted it as the incredibly well-known and ubiquitous "C Programming Language", and that my screen often has little hairs on it, it was a natural reaction. I actually do think the name is quite clever, but very prone to confusion.


'see programming language' would be more searchable, and would illustrate one use-case for the project (write some code, see what it looks like in a bunch of C-esque languages).

Also, you could make jokes about the holy see programming language.


Holy see would get confused with HolyC tho: https://rosettacode.org/wiki/Category:HolyC


I'm suggesting that just 'see programming language' be the name, but references to it as the holy see could cause confusion. Something like the great schism of programming languages :)


RIP Terry Davis. What a titan among men. He belongs up there with the likes of John Carmack.


It's really sad he's gone, TempleOS was such a cool 64-bit operating system. It's unfortunate he got bullied so much that it ended up taking him away from us. People can be so mean :-(


RIP Terry, you will be missed! TempleOS is the tombstone you rest under. :-)


Terry, I miss you. May you forever be immortalized in the amazing legacy of TempleOS and Holy C that you left behind. Ticketmaster, screw you for not appreciating the smartest programmer in the world. There is a special place in hell for TicketMaster! While Terry resides next to the Lord himself!


The Wiki had solved it using "Cee":

https://wiki.c2.com/?CeeLanguage


As a fellow Polish coder, I find the name extremely amusing. I don't know why the accent should be a problem.


Entering the name is not a problem. Searching for it very much will be.


Use cito for search. Just like "golang" vs "go".


Because most of us will have to go on the hunt to find the keystrokes to get the symbol.


I took the liberty of reading a couple of your previous HN threads, and from context there's a chance you're living in the US and are a native English speaker.

Forgive me if the above is incorrect, but in case it is:

The majority of the world's fellow humans who wish to become developers have already spent months or years of their lives learning English as non-native speakers to do just that.

Your "whining" (yes, it's intended to be somewhat impolite, but I don't think it's unreasonable given the above) that, in case Ć becomes the next-gen well-known programming tool, you'll have to spend maybe a couple of hours updating your keyboard mapping or learning some other way of quickly typing 'Ć' invokes zero of my (and many others') empathy. Though many were helpful in suggesting how to do that quickly/correctly.

I hope my words will be deemed as 'fair enough given the context' :)


I highly recommend using a compose key to enter special characters. I think it's more common by default on Linux, but I use WinCompose. I've never entered "ć" in my life, but hitting ALT, apostrophe then c, it worked.


Just store it in your brain as "cito" and google will do the rest, e.g. google "cito programing language" or "cito GitHub"


Back in the day you could enter these with alt-... numpad keystrokes. Decimal ASCII representation maybe? Might still work in some situations?


Yep, still works. Also does four digits iirc. For extended unicode.


Right-Ctrl C ' if you have a compose key (built in on Linux, WinCompose[1] for Windows, no clue for Mac).

[1] https://github.com/samhocevar/wincompose


On Mac, hold down the "c" key and a little menu with accented versions (ç, ć, č) appears, with numbers to choose which one. The flip side is that holding down letters or numbers doesn't repeat them on a Mac.


My old mac muscle memory is telling me option+e, c.

I don't have one anymore to test though, but I used to use the mac layout on Linux for a while exactly for composing using alt.


Hmm, I get ´c when doing option-e c. I feel like it should work to combine the two, but ... Looks like that functionality only applies to the vowels (and not y): á´b´c´dé´f´g´hí´j´k´l´m´nó´p´q´r´s´tú´v´w´x´y´z.


I'm not seeing that behavior; my letter/number keys simply repeat. But I'm sure I've seen it before. I guess there may be an option somewhere to enable/disable it, but offhand I can't seem to find it.

Or maybe this has changed (back) in a recent OS update? I'm running 12.0beta on this machine.


It's been around since Lion or Mountain Lion, IIRC. There's a toggle in the keyboard settings to choose between repeating and hold-for-alternate behavior (like how iOS works). The substitution behavior is the default, but if you already had repeat mode on, it may have carried forward. The old way was option+e and a vowel for an acute accent mark, but I don't think that works with consonants.


Yes, I expected a toggle in the keyboard settings, but I can't seem to find one on Monterey. (What exactly is it labeled, and where located? Maybe I'm just being blind...)

The dead keys like Opt-E for acute only work with a small set of letters in the standard US keyboard layout, although it's possible for a different layout to support more, subject to the combinations existing as precomposed characters in Unicode. (For other combinations, you'd need to enter a combining accent after the letter, and rely on the font to support placing it properly.)


The (Neo)Vim digraph is `C'`: Ctrl+k, Shift+c, '


That gives me a Ç

edit: Looks like it's supposed to work, I don't know what I'm doing wrong: https://help.ubuntu.com/community/GtkComposeTable


Are you in en_US? The sequence you mentioned should definitely work in that locale. Here’s a more authoritative link than the manually-updated wiki page: https://gitlab.freedesktop.org/xorg/lib/libx11/-/blob/master...

If you’re not in en_US you can find other locales and their Compose sequences in the nls directory.


what do you get using comma?


The same thing, ç

edit: hold on, that's only in Firefox. Anywhere else, comma+c does ç and apostrophe+c does ć


Most of whom? Most people in the world are not native English speakers.


In biology, it’s Latin. In computing, it’s English. Fellow ESLers, let’s just get over it.


Not exactly. {, }, [, ], (, ) are not part of the English alphabet but they are widely used. Same thing for ->, =>, !=. But things like λ are not used at all. So the alphabet of computing is ASCII.


> {, }, [, ], (, ) are not part of the English alphabet but they are widely used. Same thing for ->, =>, !=. But things like λ are not used at all.

Yet your sentence, clearly written in English, contains all these characters.


English also contains many accented words: https://en.wikipedia.org/wiki/English_terms_with_diacritical...

You are supposed to be able to type them if you write in English. Also, if you are in polite company, it's alright to drop the occasional Greek word. So here we are: if you can fully write in English, you can use all these characters.

Ascii-only is the modern equivalent of all-caps. Something archaic, still used by old people with hopelessly limited input devices.


Mac and iOS: hold C down, push Ć.


Have you tried the APL keyboard?


Because my screen reader does not know how to read the symbol and reads gibberish?


Sounds like the screen reader's problem. They better get on fixing that


You are not wrong, but you are not right either. See my explanations under other comments. I'd say that in their attempt to be fancy, the authors of the project are making its usage unnecessarily unpleasant.


Weird, it's just a "c" with an acute accent. Most basic fonts (Arial, Courier New, Tahoma, Times New Roman, Comic Sans) support it, and screen readers should fall back to any of those: https://www.fileformat.info/info/unicode/char/0107/fontsuppo...


The role of a screen reader is not to recognize the font; that is the job of the hosting application (OS, browser). A screen reader takes the text and converts it to sound. As I wrote in my other comment, not all Unicode symbols are mapped, and some of them are skipped for lack of capacity or for performance reasons.


I'm astonished about how much fuss a simple non-ascii character can make for people in this thread, as if it's the first time people see it.

I use such characters every day because my native language has them, so maybe that's why my tooling is naturally chosen/adapted to support it.


You are quite correct. I'm from Eastern Europe; consider your own reaction if I write a tool whose name is in Cyrillic, and every time you need to use it or search for its functionality you have to perform a complex incantation, trying to remember the exact steps to generate something that at a glance looks just like "рир".


Out of curiosity, what does your screen reader emit? "Acute accented C"?

And would the onus not be on the screen reader to not produce gibberish for a simple letter?


It has a default placeholder pronunciation for unknown symbols. Neither English nor my native language has accented letters, and creating a lookup table for all the Unicode material out there might have been considered impractical or a performance hit by the developers. Personally, I'm okay with that, because symbols with long pronunciations create overhead when reading, and a few of them can make a text practically unintelligible.


I try to be mindful of making my software as accessible as possible, but the following

> creating a lookup table for all the unicode material out there might've been considered impractical or performance-hitting for the developers.

just doesn't ring true to me in any way for current software. I understand that people may be using older software, which is why I strive to restrict myself to ASCII as much as possible for the widest possible support for my users, but my software also supports Unicode identifiers, up to and including a whole Unicode table for talking about confusables[1]. And not all TTS software "ignores" characters, which is why people advise against using 𝑓𝑎𝑛𝑐𝑦 Unicode: it doesn't get read as text; instead, each character is described individually. (This is also something that TTS software should support for their users' sake, but I digress.)

To be clear, it is reasonable to be practical and cater to the software as it exists, but that doesn't mean that we shouldn't ask for better software.

[1]: this is thanks to the crate unic-udc containing this information: https://github.com/open-i18n/rust-unic


I think you touched on one of the reasons why not all Unicode characters are supported. You said that it is not recommended to use identifiers with confusables because they waste the time of the SR user. Imagine, though, that someone decides to write an entire text in English using, let's say, the French or German alphabet, putting accented or umlauted letters everywhere possible. Now, one could say "let's convert the confusable to its base form and read it like that", but what happens when you stumble upon a new text where there is a meaningful difference between the accented and unaccented letters? The situation is not hypothetical, because very often one finds either a multilanguage document or a book with names and single words in another language.


Makes sense. Thanks for explaining.


On the iOS keyboard, press and hold C and it is there: Ć.


They do leak memory sometimes. Cito frees variables, but not temporary values. It also has no chance of knowing whether a third-party function takes ownership of an object or not.

The simplest example is string interpolation, I've just submitted an issue: https://github.com/pfusik/cito/issues/26


To be fair, the line between "Cito does not free temporary values" and "the programmer screwed up by using the wrong kind of references" is a little blurry. Most of my code in the issue above can be attributed to me writing dangerous code and getting shot in the foot, not to Cito's issues.

However, I don't think a project may simultaneously have dangling references and claim to follow "Principle of least astonishment". Dangling references are unsurprising if you've been using C++ a lot, but I expect them to be very surprising to people coming from basically any garbage-collected language: Java/C#/Kotlin/JavaScript.


I do not expect anyone to code a C or C++ library with no awareness of memory errors. This isn't a safe language on top of C++. If you target C++, you need to know it.

On the other hand, if you only know "safe" languages and target "safe" languages, there will be no dangling references.

By POLA, Ć doesn't reinvent keywords (compare to Rust) and the translated code is meant to look "obvious" compared to what you wrote in Ć. It's not "knowledge of C# is sufficient to write code correct in eight other languages" (that would be quite astonishing actually).


This is made to compile to a bunch of targets, presumably to make things easier, but then literally the first thing they do is just explain exactly how many inconsistencies there are in their number and array logic between platforms, and that it’s up to you to keep track of that.


I’m a huge fan of projects like this (and Haxe) that compile to readable code in the target language. I think they should be a standard tool that people reach for whenever they’re writing logic (as opposed to plumbing).

I think they’re challenging to popularize, though, because people already like their favorite language and love all its unique features, while these projects need to be a lowest common denominator by definition.



Changed above from https://github.com/pfusik/cito. Thanks!


Still, I find https://github.com/pfusik/cito better at explaining what it actually does, thanks to the short example.

@pfusik please consider putting that example in `ci.md` as well.




"Optional (but recommended) UTF-8 BOM."

Why? Everyone else I've seen recommends against the BOM and tries to get rid of it. It causes issues in various systems, and you could just infer UTF-8 from the file extension.


There are still many Windows programs that default to current Code Page instead of UTF-8. I prefer to have the encoding explicit in the file contents rather than rely on some external file type configuration.

This is just a loose recommendation, cito does accept files without the BOM.


But are these "many Windows programs" appropriate ways to edit code in Ć ?

It seems to me that if you say Ć program text is UTF-8 then that is an explicit encoding, and that if you feel it isn't explicit enough, an actual way to write out the encoding unambiguously is needed instead, which a BOM doesn't provide.

I am a little concerned by the wording "cito does accept files with the BOM". The BOM was chosen because it's a zero-width non-breaking character and so doesn't really mean anything, if cito thinks it means something, that's likely to be a problem elsewhere. For example if I concatenate two related Ć files, that ought to be fine, but I wonder if there's a BOM in the second file its presence in the middle of the concatenated file causes trouble.



Cool idea. I like languages that compile down into C/C++ (like Nim); it leverages the ubiquitous nature of C compilers across platforms. Are there any samples of the final sources? It says it's designed for library development, but if there's an error/bug in the generated code, am I actually going to be able to debug and parse it effectively? Generated code may be "readable", but is it understandable? Or am I going to have to fix it in the original code (where it may be difficult to pinpoint the source of the error if context is lost during compilation)?


> Are there any samples of the final sources?

https://github.com/pfusik/cito/issues/21

I generally check-in just the Ć source and not the translations, but if you want a quick look at the generated C code, here's some: https://sourceforge.net/p/asap/code/ci/master/tree/asap.c

> if there's an error/bug in the generated code am I actually going to be able to debug and parse it effectively? Generated code may be "readable", but is it understandable?

I had no problems with that so far. Nim adds a lot of boilerplate code in C output. cito sometimes adds a few lines here and there, but mostly it looks like the code you would write directly.


Thanks to LLVM, compiling your language via C seems to have declined in popularity.

It used to be a somewhat popular option for Haskell as well.


And Eiffel, which uses a JIT based VM for development and then compilation to native code via the platform C or C++ compilers.


> Are there any samples of the final sources?

Have a look here:

https://github.com/pfusik/datamatrix-ci


Those aren't samples of the final sources. It's a C-acute program that outputs a library in the various languages, plus example sources that make use of that output. Although there's usually no reason to commit generated output, this case is probably an exception.


I wonder how it deals with the lack of a garbage collector in C and C++. I could only find a single paragraph about it in the language reference and it doesn't seem to explain how to free memory in C, or how to deal with circular references in C++.

> Memory management is native to the target language. A garbage collector will be used if available in the target language. Otherwise (in C and C++), objects and arrays are allocated on the stack for maximum performance or on the heap for extra flexibility. Heap allocations use C++ smart pointers.


Evidently reference counting is used for C and C++, which means that circular references won't be collected, but in the absence of cycles everything is cleaned up.


I wonder how hard it would be to get it to output Lua code?

I have a set of runtime environments that only support Lua, but this Ć language looks interesting.

Obvious aside: I do agree with the many others that the funky accent above the C doesn't help with typing during either discussions or searching; I'd expand the name purely for clarity. This isn't actually bikeshedding: a name is used many, many times in many contexts. It matters.


Reindexing array accesses by +1 in Lua would be interesting…


Is there a demo showing Ć code and output generated for other languages?


Setting aside the interesting naming and the circumstances around this particular implementation: what do the people on HN think about the concept of such transpilation?

I recall taking a course in university about model driven programming - the idea of creating an abstract representation of logic, interfaces and other system components and then generating either full implementations or stubs in multiple languages was an interesting one, even if implementations were really hard to get right.

In practice, I've mostly only seen language-specific model-driven design tools, like JHipster (https://www.jhipster.tech/) or the likes of JPA, be reasonably successful, since there are a lot of problems with supporting abstractions across different languages and runtimes. But what has been the experience of others in that regard?


Transpiling is a cool idea. Although erasing syntax differences between languages is easy, erasing semantic differences is extremely hard. The simplest example is garbage collection: if you transpile to C, you have to track garbage yourself and collect it. You either end up leaking some memory, double-freeing, writing your own GC, or prohibiting some code from being transpiled to C.

There are subtler differences, too. For example, overflow of a signed 32-bit int is:

* Well-defined in Java. I assume in C# as well.

* Well-defined differently in Python: it starts doing arbitrary-precision arithmetic.

* Well-defined differently in JavaScript, because there are no integers, only IEEE 754 double-precision (64-bit) floating-point numbers.

* Completely undefined behavior in C and C++. See the "Signed overflow" section at https://en.cppreference.com/w/cpp/language/ub and https://godbolt.org/z/y4vIi1 specifically, if you assume reasoning about undefined behavior is allowed in the general case.


When generating C you can quite easily cast all signed to unsigned, do the operation, and cast back, if you want to avoid signed overflow.


Except that doing so does NOT avoid signed overflow. You're converting effectively w-1 bits of precision (plus a sign bit) to w bits of precision, doing the calculation, and then converting w bits of precision back to w-1 bits of precision -- that only works for a limited range of input values.

Using int32_t and uint32_t, try adding -2147483648 and -1. The arithmetic result should be -2147483649, but a program converting back and forth between signed and unsigned will produce 2147483647 -- wrong answer and wrong sign.


You do avoid "signed overflow" in the sense of the C and C++ specifications by working on unsigned numbers. "signed overflow" is UB, "unsigned overflow" is not.

Of course you will never be able to obtain an unrepresentable value (such as -2147483649) through a workaround, and if you cast back to signed you get the wrapped result (which, indeed, may have an unexpected sign). But the point of transpiling to operations on unsigned numbers is to avoid UB, not to escape basic computational bounds.


Super relevant use case: you are making an app for multiple platforms, say iOS and android. You have some common logic, but for the most part the apps are separate, native apps. This would let you write the common part once, without needing to do something like drop down to c++ and then do native interop from swift and Java.


I feel like this is a use-case which sounds a lot better on paper than it actually is. Mobile apps in my experience tend to be shaped quite heavily by the system API and UI frameworks they are built on top of. When you talk about the "common part", usually this is like the networking layer and models maybe, which tends to be very simple and takes like a half hour to write in the first place.

Even if there is a little duplicate work, it's still better to have the whole app implemented in the native language and frameworks, rather than having some odd bits which don't work with the same tooling for example and are harder to debug.


If the logic is standalone (e.g. algorithms, data structures, validations etc.), then that's indeed a really good approach!

But in my experience, you oftentimes have to work with libraries and ecosystem components which are platform dependent. In those cases, the cross platform code would only work when these ecosystem components are also written in that particular language.

Now, it can work with Xamarin Forms, React Native or similar technologies with very specific use cases, but it feels like any other, more generic attempts at getting something like that working are doomed to fail (e.g. when you don't have a cross platform solution in place, for example, different ways to access DB on different platforms, or different ways to make web requests etc., or even interact with device capabilities or handle permissions).


I'm exactly in this situation: I have an iOS Swift app and wonder what's the best way to get it to Android.

Swift and Kotlin are so similar, I wonder if there isn't a possibility of generating at least the type definitions from one to the other. That would let me ensure that the code design stays in sync (at least for the model layer).


I find it very cool. I was thinking recently of something similar, and it's interesting to see it in practice.

I can imagine there might be some serious tradeoffs in terms of making something which has to be interoperable with all these different target languages.


It's not a new idea. I believe one of the longer-standing projects is Haxe.


I had a quick reflex reaction akin to reverse peristalsis to Bjarne Stroustrup's paper "Generalizing Overloading for C++2000" when I first saw the title then read it, until I finally realized he wrote it on April 1.

https://www.stroustrup.com/whitespace98.pdf

>Generalizing Overloading for C++2000

>Bjarne Stroustrup, AT&T Labs, Florham Park, NJ, USA

>Abstract: This paper outlines the proposal for generalizing the overloading rules for Standard C++ that is expected to become part of the next revision of the standard. The focus is on general ideas rather than technical details (which can be found in AT&T Labs Technical Report no. 42, April 1,1998).

>Introduction: With the acceptance of the ISO C++ standard, the time has come to consider new directions for the C++ language and to revise the facilities already provided to make them more complete and consistent. A good example of a current facility that can be generalized into something much more powerful and useful is overloading. The aim of overloading is to accurately reflect the notations used in application areas. For example, overloading of + and * allows us to use the conventional notation for arithmetic operations for a variety of data types such as integers, floating point numbers (for built-in types), complex numbers, and infinite precision numbers (user-defined types). This existing C++ facility can be generalized to handle user-defined operators and overloaded whitespace.

>The facilities for defining new operators, such as :::, <>, pow , and abs are described in a companion paper [B. Stroustrup: "User-defined operators for fun and profit," Overload April, 1998]. Basically, this mechanism builds on experience from Algol68 and ML to allow the programmer to assign useful - and often conventional - meaning to expressions such as

    double d = z pow 2 + abs y;
>and

    if (z <> ns:::2) // …
>This facility is conceptually simple, type safe, conventional, and very simple to implement.

At least I'm not the only one who fell for it:

https://groups.google.com/a/isocpp.org/g/std-proposals/c/uTO...


Also see Kotlin: https://kotlinlang.org/docs/multiplatform.html

In Kotlin you can write a library usable from C/C++/ObjC/Swift/Java/JavaScript.


I’m fascinated when one domain wanders into the expertise of another. In this case branding. I bet Ć has some fantastic technical ideas and may be a powerful tool.

But how much does it limit its own future just by having a name that isn’t distinguishable in speech from a much larger brand?

Though this makes me rather curious about products that succeed despite their branding. What happens if we make an unpronounceable language and it takes off? Would it just be given a de facto name by the community?


They can always change the name later. Like Nimrod -> Nim. But the intrigue of this name got a click from me so it is effective at mitigating PL infant mortality.


>One of the hardest things about LaTeX is deciding how to pronounce it. This is also one of the few things I'm not going to tell you about LaTeX, since pronunciation is best determined by usage, not fiat.

Leslie Lamport

Btw, the GitHub repo is named cito.


AFAIK Xiaomi created its Mi sub-brand because 'Xiaomi' was too difficult for non-Asians to pronounce. But now that people have got used to the 'difficult' version, they're getting rid of the 'Mi' sub-brand.


What brand is that? Ć is pronounced [t͡ɕ] I am not aware of other brands using this sound.

The bigger problem is probably that English speakers can't pronounce it :) On the other hand Chinese should have no problem.


Probably pronunciation is not as important as the ability to find the character on one's own keyboard.

I mean, I probably never pronounced Python or Ruby in the same way a native English speaker does but I can write them and Google them. I have to copy and paste Ć or look for it in my keyboard's hidden characters (easier on my phone.)


Like Thai Qi?


> What happens if we make an unpronounceable language and it takes off? Would it just be given a de facto name by the community?

Yes. Nobody calls ECMAScript by its official name. We just call it JavaScript.


Bikeśhed.


C-accent?


More like C-acute. C-accent would be C' which, I am sure, will be invented at some point in the future or past.....


Ć is C-with-acute-accent. C' is C-with-apostrophe.


Still waiting for c-grave č, also known as crave.


Similar to C-dièse, the way some French speakers say C# based on how it's written.


apparently Cito (pronounced cheato?) but I'm not sure about the rationale


if the author gets this polished enough, it could be a _fantastic_ bindings generator!


As I understand, the author is Polish, so I'm not sure if this is a joke or not :)


* CORBA flashbacks *


Shudder. Thanks for reminding me of that hellscape :-(


Take a look at SWIG (Simplified Wrapper and Interface Generator), WebIDL and similar projects. There are binding generators of all kinds.


Was that a pun?


Registering Ç and Č trademarks now.


Č is the natural successor to Ć :)


Don't forget Ĉ, Ċ, Ḉ (a dialect of Ć with influences from Ç), Ꞓ (mostly used in Europe) and my favourite programming language ©


Thanks for sharing! The practical value of such languages could be to build an interlingua and support Python->Ć->Go. The question is: if Ć->Python works seamlessly, how easy is it to make the opposite translation? For instance, take some code that uses an external library, like spacy.

If this becomes ever possible, it is going to be revolutionary.


Very, very hard if you're aiming for readability. Ć is, in a sense, the "intersection" of all supported languages, so it does not support a feature unless all targets do. For Python, the missing feature would be dynamic typing.

Of course, it's possible to introduce a type like 'any' which can refer to any object, but then you essentially get untyped Ć which results in untyped Go. Which is not really Go anymore, it's more of a Python interpreter.


Thanks! One practical workflow I imagine is: 1. write the program in Python; 2. translate it into Ć (the "interlingua"); 3. compile Ć into Go. For instance, people around me believe that Python is not suitable for web-scale loads, while Go is more suitable.

The readability of the Go source code / binary matters less in this case, because they are meant for production deployments. Something similar happened with GWT: write in Java, compile to JavaScript.


In this part of the world the name would be quite political. We only use the letter Č (pronounced as ch) but our neighbors also use Ć (pronounced the same, only softer.. it's literally called "the soft Č"). There are a lot of 1st/2nd/3rd generation immigrants from those countries and their last names almost always end with Ć. So "he/she has a Ć" became a rude way of saying "he/she is an immigrant". And spelling someone's name with a Ć when it's actually Č can result in a very heated argument. I have no doubt you could find developers who would refuse to work with Ć :)


Honestly, as good as most of the excuses I've seen for "Reasons I won't work with _____ language".


Any hints of where that part of the world is to avoid it? I suppose Czech Republic?


It is Slovenia, as others have guessed. It's just one way people find to express their prejudice, and I'm sure others exist in any other part of the world. My last name ends with a C and it sometimes gets written with a Ć, and I don't lose my mind because I'm not a bigot. Also, I'm happy to see more and more people standing up for their identity - as in, please put Ć on the form because that's how it's properly spelled; yes, my ancestors come from the south and no, I don't care what you think about that, just do your job properly!


It is remarkable that this exact border (although without the proverbs on Ć) also manifests between the Czech Republic and Poland -- in the Czech language there is exactly one hard "Č" and no soft version, whereas Polish has both "cz" (the hard variant) and "Ć" (the soft variant).


It seems we are talking about Slovenia. It’d be a shame to skip a country for something small like this, particularly one as nice as Slovenia


It's not a matter of a programming language skipping a nice country, but of some stupid bigots in a nice country self-owning by skipping a programming language, of their own free will.

Discouraging bigots from joining a programming language community is a good thing for that community.


You’ve misunderstood my comment and the one I’m replying to. This person is talking about chalking off an entire country because of a comment they read on HN, without giving it much more thought. Not really my business but IMO they’d be missing out and to be honest you could use this line of reasoning to eliminate basically every popular destination


Anti-immigrant bigotry is "something small" in your opinion?


The existence of people who are anti-immigration would prevent you from visiting the UK, the USA, Italy, Germany, France, Japan... so yes, it is a weird place to draw the line and pretty small in context. I'd be curious what country passes this particular purity test...


Č is present in Czech, Slovak and the South Slavic languages of the Balkans, as well as in Lithuanian and Latvian. It was created by Jan Hus, who worked on Czech orthography in the 15th century.

The written representation of this sound may vary (Polish "czeski" vs. Czech "česky") but the sound mostly remains the same.


No, somewhere in the Balkans.


Which part of the world is "this" part of the world you are talking about?


Probably Slovenia.


Does that result in the cito => cheato compiler for ć?


I guess you are from Slovenia :)


A language "meant for implementing portable reusable libraries". Seems like a nice idea.


Not a new one. https://haxe.org.


"Haxe can build cross-platform applications." Although it also has APIs as an example use case, it is more general.


Haxe compiles to a bunch of languages. It's the same thing, just with a fleshed out standard library.


...it solves one specific problem: how to write code that can be conveniently used from C, C++, C#, Java, JavaScript, Python, Swift and OpenCL at the same time.

Super interesting. Could something like this be created for declarative UI paradigms like SwiftUI and WPF?


I wonder how networking is handled. For example reading/writing data from/to UDP/TCP socket can be a whole lot different in e.g. C# and Java.


Perhaps it could be used as a semi-fair language benchmarker. I would like to see the benchmark game where every implementation is through this language.


Yes, benchmarking is interesting. I will post benchmarks of different target languages.


I wonder what it takes to add output to Rust, or if that’s even possible…


+1 for the clean handwritten parser.


Does it have a LINQ equivalent?


No. As much as I love LINQ in C#, I don't see it easily translating to all the target languages.
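To illustrate the translation problem: a LINQ pipeline would have to be rewritten as explicit loops in targets like C that have no lazy query operators. A rough Python sketch of that rewrite (this is not cito output, just an illustration of the equivalent C# query `values.Where(x => x % 2 == 0).Select(x => x * x)`):

```python
# The kind of lowering a transpiler would need to perform for a LINQ query
# when targeting a language without closures or lazy sequences (e.g. C):
# Where(...) and Select(...) collapse into a single explicit loop.

def where_select(values):
    result = []
    for x in values:
        if x % 2 == 0:            # Where: keep even numbers
            result.append(x * x)  # Select: square each survivor
    return result

print(where_select([1, 2, 3, 4, 5]))  # → [4, 16]
```

Fusing operators into one loop is the easy case; deferred execution, `GroupBy`, and `OrderBy` are where a faithful translation gets genuinely hard.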


Yes, that is why I asked; I definitely see the issues with it. It would make a major difference though...


This really makes me think of https://xkcd.com/927/


I sat here trying to scratch the accent mark off my screen for a minute.


I have to admit I scrolled to see if it was on my screen or the character.


There are more languages and ways of using the Latin script than just English. Many languages use diacritics.


Did anyone else scroll their screen up and down in an attempt to get rid of the dust above the C?


I tried to wipe it a few times.


A text processor written in C#, GROUNDBREAKING!


Will be difficult to search for on Google.


Good prank. Anyway I hope it's a prank.


Why?


It would have been nice to see the compiler implemented in the language itself. I'm a little skeptical of any language that can't be used to compile itself.


Other than being hard to achieve (and often a secondary goal), Julia gives very good reasons not to strive for this goal: it may influence the language design in undesired ways, e.g. by shifting the focus from high-level features to low-level system design features.


I'd like to rewrite cito in Ć at some point. One problem is that it's written in fairly recent C#, so either Ć would need to catch up with two decades of C# development or the codebase would need to be downgraded to an older language version.

And we all use JavaScript NOT implemented in JavaScript. ;)


I'm a little skeptical of JS too!


The compiler is implemented in the language itself.


It’s implemented in C# (e.g. CiParser.cs)


It seems to be written in C#.


Looks like the parser is hand crafted using recursive descent method: https://github.com/pfusik/cito/blob/master/CiParser.cs

I did not look closely, but writing a parser without any tooling is asking for trouble, from parsers that enter infinite loops to ones that don't handle parse errors properly. ANTLR is way better for writing parsers for simple languages.
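For readers unfamiliar with the technique: a hand-written recursive-descent parser dedicates one function per grammar rule, each consuming tokens and calling the rules below it. A minimal, self-contained Python sketch for arithmetic expressions (an independent illustration, not code from the cito repository):

```python
# Minimal recursive-descent parser/evaluator for integer arithmetic.
# Grammar: expr := term (('+'|'-') term)* ; term := factor (('*'|'/') factor)*
#          factor := NUMBER | '(' expr ')'
import re

TOKEN_RE = re.compile(r"\s*(?:(\d+)|(.))")

def tokenize(text):
    # Numbers become ints; every other non-space character is its own token.
    return [int(num) if num else op for num, op in TOKEN_RE.findall(text)]

class Parser:
    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def parse_expr(self):
        value = self.parse_term()
        while self.peek() in ('+', '-'):
            value = value + self.parse_term() if self.advance() == '+' \
                    else value - self.parse_term()
        return value

    def parse_term(self):
        value = self.parse_factor()
        while self.peek() in ('*', '/'):
            value = value * self.parse_factor() if self.advance() == '*' \
                    else value // self.parse_factor()
        return value

    def parse_factor(self):
        tok = self.advance()
        if isinstance(tok, int):
            return tok
        if tok == '(':
            value = self.parse_expr()
            if self.advance() != ')':
                raise SyntaxError("expected ')'")
            return value
        raise SyntaxError(f"unexpected token: {tok!r}")

print(Parser("2 + 3 * (4 - 1)").parse_expr())  # → 11
```

Note the structure guards against the infinite-loop pitfall mentioned above: every loop iteration and every `parse_factor` call consumes at least one token, so the parser always makes progress.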


> I did not look closely but writing parser without any tooling is asking for trouble

Says who?



