More

sparkie · 2026-06-13T08:12:07 1781338327

That may be true, but it may also mean you utilize more memory than you need to. If you aren't shrinking the array when you no longer need previously allocated capacity then you're wasting memory. You could end up with an array of 10 elements and an allocation of 2^10.

The capacity as bit_ceil(len) ensures that at most, half of the allocated space is wasted - excess space is O(n).

sparkie · 2026-06-13T07:25:48 1781335548

If you use a struct with a `void*`, you also need to specify the type on usage, where here it's done with `typeof`.

sparkie · 2026-06-13T07:18:21 1781335101

In C23 this approach is nice, but in older versions of C we end up with awful macros where we need to define the structure before we use it.

    #define Array(T) struct array_##T
    #define DEFINE_ARRAY(T) struct array_##T { size_t len; T *elems; }
    
    DEFINE_ARRAY(int);
    Array(int) foo;
    Array(int) bar;

C23 has relaxed rules for redefining the same struct, so we can avoid having to create the struct up front.

    #define Array(T) struct array_##T { size_t len; T *elems; }

    Array(int) foo;
    Array(int) bar;

sparkie · 2026-06-13T07:14:29 1781334869

The reason the struct is avoided here is so the array can be typed to its element type (rather than casting to and from `void*`).

With a struct we would need one struct for each element type - at least prior to C23 which provides a better approach where we can declare the same struct multiple times in a translation unit.

    #define Array(T) struct array_##T { size_t len; T *elems; }

We can use `Array(int)` in multiple places in the same TU - but in C11 or earlier, this is an error.

sparkie · 2026-06-13T07:07:47 1781334467

The concept of not storing capacity isn't silly. If you need to reserve space then it's not the appropriate structure, but it's otherwise fine.

However, using an 2-element array to avoid using a struct is silly.

sparkie · 2026-06-13T06:55:15 1781333715

Really? It's been done plenty and I thought was quite common knowledge. Some of the <stdbit.h> provided functions are basically for this purpose.

stdc_bit_ceil(len) gets the smallest power of 2 not less than len, which is our capacity. This is usually implemented with a clz instruction.

stdc_has_single_bit(len) determines if it's a power of 2 - typically implemented with a popcount instruction (popcount(len)==1).

The approach isn't used in older (90s and earlier) texts because hardware support for popcount/clz wasn't commonplace and the cost to do it in software wasn't worth it, but it is mentioned in some texts.

astrobe_ · 2026-06-13T07:13:03 1781334783

I think it was sarcasm. I have about as much experience as them in low-level C programming, and I was wondering why this is on front page. I've also discovered again a few things too, so I won't look down on OP. It's certainly better than vibe coding.

userbinator · 2026-06-13T07:36:51 1781336211

No, I'm being serious.

I know about XOR-linked lists and a few other tricks for very efficient memory usage in embedded systems, but it's my first time seeing implicit allocation lengths.

OskarS · 2026-06-13T07:44:36 1781336676

Everyone knows about bit-fiddling functions, what the poster was referring to is using this trick to not have to store the capacity for a dynamic vector. Which is genuinely very clever and rare, I don’t think I’ve ever seen it done quite like this either. I think that’s probably because it’s not really that good of an idea, if the vector can shrink, you really should store the capacity.

sparkie · 2026-06-13T08:07:24 1781338044

I don't think it's that rare. I've been using the technique for years, and I've seen it done in other work. Bagwell's VList[1] for example uses the equivalent of `bit_ceil` to determine the size of each block without having to store it - and there are earlier works based on the same trick. RAOTS, which is referenced by the VList, mentions using the technique, but itself uses a slightly more complex trick where we can calculate the size of a block based on the approx square root of the length.

You can use the trick if the array can shrink as long as you always shrink the allocation when length goes below the next power of 2 not greater than len (which may make use of stdc_bit_floor).

[1]:https://cl-pdx.com/static/techlists.pdf

entrope · 2026-06-13T10:06:19 1781345179

Ignoring multiple evaluation, one can also #define stdc_has_single_bit(X) !((X) & ((X)-1)). If X isn't a power of two, the -1 will leave the MSB in place.

sparkie · 2026-05-30T20:30:05 1780173005

Bitcoin has a variable width encoding (`CompactSize`), but it doesn't prevent overlong encodings - however there are various canonicalization rules in the Bitcoin protocol to require minimal encoding.

sparkie · 2026-05-02T01:50:03 1777686603

The C charter has a rule of "no invention".

Anything needs to be demonstrated and used in practice before being included in the standard. The standard is only meant to codify existing practices, not introduce new ideas.

It's up to compiler developers to ship first, standardize later.

thayne · 2026-05-02T02:00:53 1777687253

That produces a bit of a chicken and egg probablem for a stdlib overhaul. Compilers and libc implementations don't have a strong reason to implement safer APIs, because if it is non-standard then projects that want to be portable won't use it , but it won't get standardized unless they do add safer APIs.

So the best hope is probably for a third party library that has safet APIs to get popular enough that it becomes a de facto standard.

EPWN3D · 2026-05-02T05:26:32 1777699592

I think the real failing is that new language features then must be prototyped by people who have a background in compilers. That's a very small subset of the overall C community.

I don't have any clue how to patch clang's front end. I'm not a language or compiler person. I just want to make stuff better. There needs to be a playground for people like me, and hopefully lib0xc can be that playground.

jstimpfle · 2026-05-02T13:51:03 1777729863

By adding to the language itself, you mostly make stuff worse. The major reason why C is useful is its quite stable syntax and semantics. Language is typically not the area where you want to add code. It's much better (and much easier) to invent function APIs. See how they shake out, if they're good you might get some adoption.

sparkie · 2026-04-30T19:47:51 1777578471

Lisps are expression based languages, but not pure. It's easy to mistake it as "like most other languages", but it's not quite the same - everything is an expression and returns a result. There are no "statements".

They appear procedural because of syntax sugar - ie, the body of a function is basically implicitly wrapped in (progn ...), (begin ...), ($sequence ...), etc - which are all equivalent expression forms which evaluate their sub-expressions in order and return the result of the last one.

   (progn a b c)      ;; CommonLisp
   (begin a b c)      ;; Scheme
   ($sequence a b c)  ;; Kernel

   ;; evaluate a, then b, then c, 
   ;; ignore the results of evaluating a and b 
   ;; return the result of evaluating c.

So when you see:

   (define (foo)
       (expr1)
       (expr2)
       (expr3))

If we desugar, it would be

    (define foo (lambda () (begin (expr1) (expr3) (expr3))))

We get behavior that looks just like other procedural languages (without a "return" keyword) - but everything is still an expression.

A similarity is the comma operator in C. Imagine you didn't write statements but the body of your C functions was entirely chains of comma operators.

CommonLisp has a couple of other useful related forms - prog1 and prog2. They still evaluate their sub-expressions in order, but prog1 returns the result of evaluating the first expression, and prog2 returns the result of the second expression.

    (define (foo) (prog1 (expr1) (expr2) (expr3))
    (foo)
 
    ;; evaluates expr1, then expr2, then expr3
    ;; returns the result of evaluating expr1
    ;; ignores the results of evaluating expr2 and expr3

ux266478 · 2026-04-30T21:58:04 1777586284

I'm very familiar with Lisp, you don't have to explain it to me. I think you are mistaken as to the meaning of what is meant by "procedural language" here. It's simply the mode of computation where a program is directly conceived as a hierarchical sequence of steps. I think you got caught up in the idea of a grand disjunction of "expressions" and "statements", with a distinguishing feature involving return values, and so on. But no, that's not particularly relevant here (nor is it universally true, or applicable)

To simplify it, you can consider cons, car, cdr the beating heart of Lisp. These special forms directly encode the execution semantics as the traversal of a head over cells of a tape. Lisp belongs to the same family as the Turing machine, and that's a very big family. SML also belongs to this family. The overwhelming majority of programming languages belong to this family.

sparkie · 2026-04-30T03:37:49 1777520269

In Kernel I would use something like this:

    true        #t
    false       #f
    null        ()
    [...]       (& ...)
    "k" : v     (: k v)
    {...}       (@ ...)

Where &, :, @ are defined as:

    ($define! &
        ($lambda args (cons list args)))

    ($define! : 
        ($vau (key value) env
            (list key (eval value env))))
            
    ($define! @ 
        (wrap 
            ($vau kvpairs env 
                (eval (list* $bindings->environment kvpairs) env))))

Using the "person" example from the JSON/syntax section on Wikipedia:

    ($define! person
        (@
            (: first_name "John")
            (: last_name "Smith")
            (: is_alive #t)
            (: age 27)
            (: address 
                (@
                    (: street_address "21 2nd Street")
                    (: city "New York")
                    (: state "NY")
                    (: postal_code "10021-3100")))
            (: phone_numbers
                (& (@ (: type "home") (: number "212 555-1234"))
                   (@ (: type "office") (: number "646 555-4567"))))
            (: children
                (& "Catherine" "Thomas" "Trevor"))
            (: spouse ())))

I would then define `?`

    ($define! ? $remote-eval)

Now we can query the object.

    > (? age person)
    27

    > (? postal_code (? address person))
    "10021-3100"

    > (car (? children person))
    "Catherine"

    > (cdr (? children person))
    ("Thomas" "Trevor")

    > (? type (cadr (? phone_numbers person)))
    "office"

    > (? number (car (? phone_numbers person)))
    "212 555-1234"

    > ($define! full_name ($lambda (p) (string-append (? first_name p) " " (? last_name p))))
    > (full_name person)
    "John Smith"

anonzzzies · 2026-04-30T04:09:47 1777522187

[0] https://web.cs.wpi.edu/~jshutt/kernel.html

aidenn0 · 2026-04-30T14:30:30 1777559430

I don't know kernel very well; what will the value of person print as?

sparkie · 2026-04-30T15:04:10 1777561450

There's no "external representation" for environments. In klisp it will just print:

    [#environment]

The environment type is encapsulated, so it doesn't give you very useful debug information.

Perhaps having `@` produce an environment is the wrong approach and we should just produce an association list instead - then move `$bindings->environment` into the `?` operative to enable querying.