This is what 0-based indexing looks like in data analysis: >In order to read a c...

dangom · on Aug 9, 2018

This is very much a non-argument. Calling the columns the 4th and 7th is as arbitrary as calling them the 3rd and the 6th.

Again, 0-based indexing exists to fit a purpose: http://www.cs.utexas.edu/users/EWD/transcriptions/EWD08xx/EW...

In my opinion, reading `a = b[1:n-1]` hurts much more than reading `a = b[:n]`.

coldtea · on Aug 9, 2018

>Calling the columns the 4th and 7th is as arbitrary as calling them the 3rd and the 6th.

No, that's their ordinal position, and how 8 billion non-programmers would refer to them in any everyday setting.

That's also how programmers would refer to them if it wasn't for a historical accident.

What's more, that's also how programmers refer to them when they talk among themselves and not to the machine ("check the 3rd column" not "check the column at the index of 2").

Certhas · on Aug 9, 2018

It's not arbitrary. It's English. In the list [apple, orange, tree] which element is orange? It's the second element.

I have taught Python quite a bit, and I have gotten good at explaining 0 based indexing and slicing based on it. When I switched to Julia there was nothing to explain. And my code has about as many +/- 1s as before...

jacobolus · on Aug 9, 2018

Just because it is our common convention in lay conversation doesn’t mean it isn’t “arbitrary”.

These spoken language conventions developed before there was an established name for “zero” or even a concept that “nothing” could be a number per se.

For similar reasons, we have no zero cards in our decks, no zero faces on our dice, no zero hour on our clocks, no zero year in our calendar, no zeroth floors in our buildings, East Asian babies are born with age one year, etc.

It’s only by another set of historical accidents that we have a proper 0 in our method of writing down whole numbers. Thankfully that one was of obvious enough benefit that it became widely adopted.

FabHK · on Aug 10, 2018

> no zeroth floors in our buildings

In (North?) America. In Europe, there's a ground floor (zero), then first floor (1), etc. Basement is -1 (etc.).

A European friend of mine arrived at college in the USA, and was assigned a room on the first floor of the dorm. She then asked the housing office whether there was a lift, because she had quite some heavy luggage, earning some rather amused looks :-)

jacobolus · on Aug 10, 2018

Yes, it would be interesting to know the history.

Whoever designed the European convention for labeling building floors was numerically competent.

Too bad medieval European mathematicians and the designers of Fortran weren’t. ;-)

Certhas · on Aug 10, 2018

The base level doesn't need to have a floor, it's just ground. Once you add a floor you are on the first floor above ground. Really, your condescending tone, as if all the mathematicians who prefer to work with 1-indexing are just incompetent, is grating.

I'm happy to be writing

```
for i in 1:n
    func!(a[i])
end
```

to iterate over an object of length n. Or split an array as a[1:m], a[m+1:n]. Slicing semantics, which are far more prevalent in my code (and the code I read) than index arithmetic, are truly vastly simplified by Julia's 1-based indexing compared to the 0-based numpy conventions. We simply no longer code in the world that Dijkstra argued for, and I have not seen anybody give a clear argument that is actually rooted in maths and contemporary programming.

I genuinely thought that the Python convention was brilliant, and that 1-based indexing in Julia would suck. It turned out not to be the case.

jacobolus · on Aug 10, 2018

Sorry, that last bit of my comment was gratuitous.

I am legitimately (mildly) curious about the history of the different naming conventions for floors of buildings though.

> The base level doesn't need to have a floor, it's just ground. Once you add a floor you are on the first floor above ground.

Yes, my point is this is an example where the European 0-based indexing system makes more sense (in my opinion) than the American 1-based indexing system. I speculate that whoever started calling the ground floor the “first floor” hadn’t really put much thought into how well that would generalize to large buildings with many floors including some underground.

Similarly, whoever decided the calendar should start at year 1 AD with the prior year as 1 BC hadn’t really considered that it might be nice to do arithmetic with intervals of years across the boundary.
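The interval arithmetic across the 1 BC / 1 AD boundary can be sketched in Python as follows. This is only an illustration: the helper names `astronomical_year` and `years_between` are made up here, and the mapping used is astronomical year numbering, in which 1 BC is treated as year 0.

```python
def astronomical_year(year, era):
    """Map a calendar year to astronomical numbering: 1 BC -> 0, 2 BC -> -1."""
    if year < 1:
        raise ValueError("calendar years start at 1 in both eras")
    if era == "AD":
        return year
    if era == "BC":
        return 1 - year
    raise ValueError("era must be 'AD' or 'BC'")


def years_between(start, end):
    """Years elapsed between two (year, era) pairs, same date in each year."""
    return astronomical_year(*end) - astronomical_year(*start)


# 1 BC is immediately followed by 1 AD, so only one year elapses.
assert years_between((1, "BC"), (1, "AD")) == 1

# From 5 BC to 5 AD is 9 years, not the 10 that 5 - (-5) would suggest.
assert years_between((5, "BC"), (5, "AD")) == 9
```

Once the years are shifted onto a scale that includes 0, plain subtraction works on both sides of the boundary.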

There are many standard mathematical formulas which are clarified by indexing from 0. But nobody can switch because the 1-indexed version is culturally fixed. Most of the rest of the time the 0-indexed vs. 1-indexed versions make basically no difference. It is rare that the 1-indexed version is notably nicer.

> Or split an array as a[1:m], a[m+1:n]

Yes, I find it substantially clearer to write this split as a[:m], a[m:]. Particularly when dealing with e.g. parsing serialized structured data. But also when writing numerical code. Carrying the extra 1s around just adds clutter, and forces me to add and subtract 1s all over the place; reasoning about it adds mental overhead, and extra bugs sneak in. (At least when writing Matlab code; I haven’t spent much time with Julia.)
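A minimal Python sketch of the a[:m], a[m:] split mentioned above. The list contents and the split point are arbitrary values chosen for illustration.

```python
# Illustrative data; the values and the split point are arbitrary.
a = [10, 20, 30, 40, 50]
m = 2

# Half-open slices: a[:m] holds indices 0..m-1, a[m:] holds m..len(a)-1.
head, tail = a[:m], a[m:]

# The same number m appears on both sides of the split, with no +1 or -1.
assert head + tail == a

# The length of a[i:j] is j - i (for 0 <= i <= j <= len(a)).
i, j = 1, 4
assert len(a[i:j]) == j - i
```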

beojan · on Aug 10, 2018

> no zero hour on our clocks

There is. We call it 12 for some crazy reason (it goes 12 AM, 1 AM, 2 AM, ..., 11 AM, 12 PM, 1 PM, ...).
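The relabeling of hour zero as 12 can be sketched in Python like this. The function name `to_12_hour` is hypothetical, and the input is assumed to be an hour on a 24-hour clock.

```python
def to_12_hour(hour24):
    """Convert an hour in 0..23 to (hour12, 'AM' or 'PM')."""
    if not 0 <= hour24 <= 23:
        raise ValueError("hour24 must be in 0..23")
    suffix = "AM" if hour24 < 12 else "PM"
    # hour24 % 12 is 0-based (0..11); the clock face shows that 0 as 12.
    hour12 = hour24 % 12 or 12
    return hour12, suffix


# The sequence starts 12 AM, 1 AM, 2 AM, ... and later 11 AM, 12 PM, 1 PM.
assert [to_12_hour(h) for h in (0, 1, 2)] == [(12, "AM"), (1, "AM"), (2, "AM")]
assert [to_12_hour(h) for h in (11, 12, 13)] == [(11, "AM"), (12, "PM"), (1, "PM")]
```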

> no zero year in our calendar

Which is quite irritating really. New Year's Day 2000 wasn't the start of the 3rd millennium, because there was no year zero.

> East Asian babies are born with age one year

But not western babies.

jacobolus · on Aug 10, 2018

The 12 on a clock is a compromise between a 1-indexing oral culture and a naturally 0-indexed use case (which came from the Sumerians, who had a better number system).

I don’t know the history of reported ages of Western babies.

> quite irritating really

Yes that is my point.

DNF2 · on Aug 9, 2018

The equivalent of `a = b[:n]` is `a = b[1:n]`. And I don't think you can get around admitting that there is a fundamental ambiguity in the spoken statement "Take a look at the fourth column!" in a zero-based index system. You always need a follow-up question to clarify whether you mean "everyday informal speech fourth" or "zero-index fourth."
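The correspondence between the two slice styles can be checked with a small Python sketch. The helper `one_based_slice` is hypothetical: it mimics a 1-based range that includes both ends on top of Python's 0-based, half-open slices.

```python
def one_based_slice(seq, first, last):
    """Return elements first..last, counting from 1 and including both ends.

    Hypothetical helper for illustration, built on Python's own slicing.
    """
    return seq[first - 1:last]


b = ["a", "b", "c", "d", "e"]
n = 3

# Taking the first n elements: 1-based inclusive b[1:n] vs. Python b[:n].
assert one_based_slice(b, 1, n) == b[:n]

# Splitting at m: a[1:m], a[m+1:n] vs. a[:m], a[m:].
m = 2
assert one_based_slice(b, 1, m) == b[:m]
assert one_based_slice(b, m + 1, len(b)) == b[m:]
```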

beojan · on Aug 10, 2018

But you can say "Take a look at column 4" instead, which is unambiguous.

thousandautumns · on Aug 10, 2018

Calling the 3rd and 6th columns "3rd" and "6th" is hardly arbitrary.

e12e · on Aug 9, 2018

a = b[1..n]?