There are three main ways to denote code blocks:
- Delimiters: most C family languages, like C/C++, Java, and Rust, use curly braces {}
- Keywords: begin...end, like in Ruby. It's the most verbose solution, adds a lot of useless overhead to the parser, and often makes the language grammar ambiguous.
- Indentation: like in Python
U uses curly braces to group statements. Therefore, there is no difference between a function body and a scope nested inside that body.
U will trigger a warning, as variable a in scope 1 hides the parameter with the same name, a.
About Indentation
This section is not a rant about Python. It's just a pragmatic view of the reasons why we don't use indentation as code blocks.
The indentation feature is visually pleasing but leads to some annoyances. Braces handle those annoyances easily without making the code less readable or less visually appealing. Annoyances include:
- An IDE is required to write correct code automatically; without one, you end up losing time formatting your code by hand.
- Unlike JSON, indentation-based code cannot be shared efficiently between machines, as its layout is intended for humans. U aims to be used in protocols too, so it cannot rely on indentation.
- Files cannot be concatenated: in Python, if two files have different indentation levels, you cannot join them.
- Text cannot be correctly copied and pasted: copying out of a web page into a text file breaks the code structure, unless the code is wrapped in a <pre> tag. Even between code editors, copy/paste often requires a second pass to fix the indentation.
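As a small illustration of the copy/paste point, here is a sketch in Python 3 (the variable names and the run helper are made up for this example). It runs the same statements twice; the only difference is that the last line lost its indentation, as can happen after a paste. Both versions are valid Python, so nothing warns that the meaning changed.

```python
# The loop body as the author wrote it: both statements run on every iteration.
original = (
    "total = 0\n"
    "for i in range(3):\n"
    "    total += i\n"
    "    total *= 2\n"
)

# The same text after a paste that dropped the indentation of the last line:
# the multiplication now runs once, after the loop.
pasted = (
    "total = 0\n"
    "for i in range(3):\n"
    "    total += i\n"
    "total *= 2\n"
)


def run(src):
    """Execute a source string in a fresh namespace and return its 'total'."""
    namespace = {}
    exec(src, namespace)
    return namespace["total"]


print(run(original))  # 8
print(run(pasted))    # 6
```

With braces, the extent of the loop body would be carried by the delimiters rather than by leading whitespace, so losing the whitespace would not change the result.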
The most important annoyance is that humans and the parser do not see the same thing. It's as if there were two languages at once: one intended to be read by humans, and another intended to be read by the parser. These two languages may say different things, with the indentation suggesting one visual logic and the parser understanding something completely different.
This example compares Python and U code blocks. Both sources could be written on one line.
In Python, humans see:
and the parser sees without indentation:
if x > 2:\n    print 'ok'\nelse:\n    print 'error'
The parser has to reconstruct where each block starts and ends, combining newlines and spaces to delimit the blocks' bodies and statements.
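To make that reconstruction step concrete, the sketch below uses Python 3's standard tokenize module to list the block markers the tokenizer synthesizes from leading whitespace. It assumes Python 3, so print is written as a function call; the block_tokens helper is a hypothetical name introduced only for this illustration.

```python
import io
import tokenize

# The if/else example as a raw character stream, written for Python 3.
source = "if x > 2:\n    print('ok')\nelse:\n    print('error')\n"


def block_tokens(src):
    """Return the names of the INDENT/DEDENT tokens produced for src."""
    names = []
    for tok in tokenize.generate_tokens(io.StringIO(src).readline):
        # INDENT and DEDENT do not correspond to any visible character:
        # the tokenizer derives them by comparing leading whitespace
        # from one logical line to the next.
        if tok.type in (tokenize.INDENT, tokenize.DEDENT):
            names.append(tokenize.tok_name[tok.type])
    return names


print(block_tokens(source))  # ['INDENT', 'DEDENT', 'INDENT', 'DEDENT']
```

Each INDENT/DEDENT pair plays the role that an explicit { and } pair would play in a brace-delimited language, except that it has to be inferred rather than read directly from the source.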
With U, no such divergence:
Both humans and the parser see:
:? x > 2, { \n < 'ok'\n}, {\n < 'error'\n}
The parser does not need to do extra work that might introduce errors.