One big thing that's missing from this list is the __traceback__ on exceptions, which pretty much does what you think it does. In Python 2, there's no way to access the traceback for an exception once you've left the `except:` block. This matters when you're using things like gevent; if one of your gevent greenlets throws an exception and you inspect the .exception attribute on it, you'll be able to get the exception message but won't know what line it came from.
N.B. This is absent from Python 2 due to concerns about creating reference cycles. The garbage collector got better in the meantime, and the feature was never backported to Python 2.
Huh, I was almost sure you were wrong and that I'd done this before, but the frame does indeed "go away" once you leave the except block. Fascinating.
In [8]: try: crasher()
...: except TypeError as e:
...: exc = e
...:
In [9]: type(exc)
Out[9]: TypeError
In [10]: traceback.print_tb(exc)
AttributeError: 'exceptions.TypeError' object has no attribute 'tb_frame'
I can understand the concern, though it seems like there would be easier fixes than the garbage collector for that specific case.
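In Python 3, by contrast, the traceback survives the except block as an attribute of the exception itself. A minimal sketch (crasher() here is just a made-up stand-in that raises TypeError):

```python
import traceback

def crasher():
    return len(None)   # raises TypeError

try:
    crasher()
except TypeError as e:
    exc = e

# The traceback now lives on the exception object itself:
tb_lines = traceback.format_tb(exc.__traceback__)
print(tb_lines[-1])    # points at the len(None) line inside crasher()
```

This is exactly what makes the gevent case workable: the greenlet's .exception attribute carries its own traceback along with it.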
I did not know you could append to a Path via "/", but that's really awesome! I also really love working with generators when I write Python. They are such a simple idea that's very powerful, and I miss them so much when I go back to JavaScript (I know JavaScript has them now, but I haven't written them, and they don't look as fluent as in Python 3, where large parts of the language design are based around them).
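For anyone who hasn't seen either feature, a quick sketch of both (using PurePosixPath so the output is platform-independent):

```python
from pathlib import PurePosixPath

# "/" is overloaded on Path objects to join path components:
p = PurePosixPath("/tmp") / "project" / "data.csv"
print(p)          # /tmp/project/data.csv
print(p.suffix)   # .csv

# Generators are lazy; nothing is computed until the result is consumed:
squares = (n * n for n in range(5))
print(sum(squares))   # 30
```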
Overloading operators for cute purposes is usually a misfeature. There was a fad for this in the early C++ days. I once wrote a marshalling library which overloaded "||", so that you could write
p = stream || rec.a || rec.b || rec.c;
and get an object which, if written, marshalled the record, and if read, unmarshalled it. The marshalling order was only specified once. Cute, but in retrospect, bad.
Python's classic overload problem comes from using "+" for concatenate. For built-in vectors, "+" is concatenate, but for NumPy arrays, it's addition. You can use "+" between a NumPy array and a built-in vector. Does it add, or concatenate?
("|" is supposed to be the concatenate operator; it was added to ASCII to support PL/I. But decades of C have made it the bitwise OR operator to most programmers, so we're stuck. "_" is allowed within symbols, so that's taken. There's "⁀", character tie, but nobody can type it.)
It seems to be best to restrict the math operators to math.
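For the record, the mixed case broadcasts and adds rather than concatenating; a quick check (assuming NumPy is installed):

```python
import numpy as np

a = np.array([1, 2, 3])

# NumPy coerces the plain list to an array and adds elementwise:
result = a + [10, 20, 30]
print(result)                     # [11 22 33]

# Whereas "+" between two plain lists concatenates:
print([1, 2, 3] + [10, 20, 30])   # [1, 2, 3, 10, 20, 30]
```

So the same expression means two very different things depending on which operand "wins" the coercion, which is the ambiguity being complained about.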
> Overloading operators for cute purposes is usually a misfeature
Whether a function for which an operator is overloaded is “cute” or “fundamental to clarity” depends on the context in which it is used.
> There was a fad for this in the early C++ days.
Yeah, like overloading the bitwise shift operators as stream insertion/extraction operators.
> Python's classic overload problem comes from using "+" for concatenate. For built-in vectors, "+" is concatenate
Python doesn't have built-in vectors, it has lists.
> but for NumPy arrays, it's addition.
Not sure I see a problem. For numbers, it's an addition. For simple ordered collections, it's concatenation. For matrix-like collections, it's memberwise addition.
> You can use "+" between a NumPy array and a built-in vector. Does it add, or concatenate?
To the extent that's a problem, it's a problem with implicit coercion more than operator overloading.
> Whether a function for which an operator is overloaded is “cute” or “fundamental to clarity” depends on the context in which it is used.
The fact that it requires context other than the language spec is where the problem resides. The cognitive overhead of reading code is high enough without having to alias common operators.
> The cognitive overhead of reading code is high enough without having to alias common operators.
The cognitive overhead of reading code is, IME, often lower with context-appropriate operator overloading (just as it is with well-chosen vs. poorly-chosen method names).
There is a reason human languages (and, yes, even the notation of mathematics) develop context-specific dialects, and it is to reduce the cognitive overhead of communication. Code switching with clear contextual boundaries is often far more efficient than using a one-size-fits-all notation that is context-insensitive.
I'm pleased to see this perspective represented; it's the main reason I still really love the Perl programming language. The idea that context is useful and can actually help with clarity is a core concept.
The problem is that a reasonable context response is different from person to person and culture to culture.
I feel like this makes things like using operator overloading to clean up code incredibly dangerous, especially in popular libraries.
Best: Context-appropriate overload that makes sense to me
Good: Standard language meaning that may not fit cleanly in context, but has the same definition always
Worst: Context-appropriate overload that doesn't make sense to me
So if you aren't sure whether or not your clever idea will actually fall in the third category for some people, go with the second instead of pushing for the first.
Aka, "for every story someone can tell of a magical operator overloading that makes code more readable, I can come up with some dumbass architect who did the most counterintuitive thing you can imagine."
The difference between an operator and an appropriately named function is small enough that in most (but not all, I'll give you that) cases overloaded operators tend to confuse more than they enlighten.
I've worked for a while on a C++ vector/matrix library which made good use of C++'s capabilities to overload operators. It all made sense, in the beginning. But as time wore on and more and more exceptions crept in you'd get very confusing bits of code where in the same fragment '+' would mean one thing, then another, ditto for '*' and endless frustration when no free operator could be overloaded and in the end you'd end up having to call a function anyway.
Maintaining that over the years got less and less pleasant, even if at the start it was probably some of the cleanest code for the limited purpose that it had.
Personally, I avoid operator overloading, just as I try to avoid reader macros and other clever bits. It's not that I don't know how to use them or how they work, it's that I love being able to look at a piece of code without also having to study the environment it operates in in order to know what it does.
KISS. Though, adding two vectors or multiplying a vector and a matrix with just an operator looks very good. Better than spelling it out with nested function calls.
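The contrast is easy to see in a toy sketch (Vec2 and vec_add are invented names, purely for illustration):

```python
class Vec2:
    """Minimal 2-D vector; "+" here is arguably math, not cuteness."""
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __add__(self, other):
        return Vec2(self.x + other.x, self.y + other.y)

    def __repr__(self):
        return f"Vec2({self.x}, {self.y})"

def vec_add(a, b):
    """The function-call alternative to the overloaded operator."""
    return Vec2(a.x + b.x, a.y + b.y)

u, v, w = Vec2(1, 0), Vec2(0, 1), Vec2(2, 2)
print(u + v + w)                    # Vec2(3, 3)
print(vec_add(vec_add(u, v), w))    # Vec2(3, 3), but noisier to read
```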
> But as time wore on and more and more exceptions crept in you'd get very confusing bits of code where in the same fragment '+' would mean one thing, then another, ditto for '*' and endless frustration when no free operator could be overloaded and in the end you'd end up having to call a function anyway.
The problem here seems to me to be “having too small a space of valid operators in the host language for the problem domain” more than “operator overloading”, as such. But it's true that the space of available operators is a factor in whether you should use operators over functions for a particular use, because it definitely impacts clarity.
IMO, ideally a language should support a wide range of operators, and programmers should be very selective in how they use them; additionally, overloaded operators in libraries intended for reuse should usually have equivalent text-named functions/methods, for when the library is used in contexts where using the overloaded operators would be confusing. (Ideally, you'd be able to import selectively so that you don't even bring the operators in scope where you don't want to use them.)
> additionally, overloaded operators in libraries intended for reuse should usually have equivalent text-named functions/methods, for when the library is used in contexts where using the overloaded operators would be confusing.
Ok, but in that case you need two systems to access the same code, that only adds to the confusion and really negates all upside from having the overloaded operator in the first place.
That violates DRY in a very ugly way.
Anyway, I think we can agree on one thing: operator overloading is something that one should not do just because it is possible, but only for very good reasons.
I don't think aliasing (or the moral equivalent, where operator vs. method/function can't be an alias from an implementation point of view) violates DRY. Presenting an alternative API for different use cases isn't repetition.
> Anyway, I think we can agree on one thing: operator overloading is something that one should not do just because it is possible, but only for very good reasons.
Another interesting example is in PyParsing, which can overload + and | and a few other things to produce BNF-style grammar parsers inside of Python code.
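This isn't PyParsing's actual API, but the flavor of the idea can be sketched with a toy combinator class (all names here are invented for illustration):

```python
class P:
    """Tiny parser combinator: wraps a function that consumes tokens.

    match(tokens) returns the remaining tokens on success, or None.
    """
    def __init__(self, match):
        self.match = match

    @classmethod
    def lit(cls, word):
        """Match a single literal token."""
        def m(toks):
            return toks[1:] if toks and toks[0] == word else None
        return cls(m)

    def __add__(self, other):
        """Sequence: match self, then other on the remainder."""
        def m(toks):
            rest = self.match(toks)
            return other.match(rest) if rest is not None else None
        return P(m)

    def __or__(self, other):
        """Alternative: try self, fall back to other."""
        def m(toks):
            rest = self.match(toks)
            return rest if rest is not None else other.match(toks)
        return P(m)

# BNF-flavored grammar built with overloaded operators:
greeting = (P.lit("hello") | P.lit("hi")) + P.lit("world")
print(greeting.match(["hi", "world"]))    # [] -- full match, nothing left
print(greeting.match(["hey", "world"]))   # None -- no match
```

Whether this reads as "fundamental to clarity" or as cuteness is, of course, exactly the dispute upthread.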
On the other hand, someone might just as well say “the cognitive overhead of reading domain-specific code is high enough without having to alias common operators”, where “common operators” here means within the domain, in the numpy example matrix algebra.
Yup, overloading existing operators for new meanings is smelly. It’s better if the language allows creating new operators, so you’re not stepping on someone else’s semantics. For example, Haskell has “</>” in the “filepath” package as an alias of “combine”. Custom operators can certainly be abused, but most packages that define operators only do so for good reason, and crucially they’re still searchable with Hoogle.
You can even set the precedence of your new operator in Haskell, and confuse everybody. Maybe beyond +, -, *, and /, you should have to use parentheses.
Here's C's operator precedence.[1] All 15 levels.
Think of the maintenance programmer who will have to fix this.
Eh, I’ve gotten somewhat confusing type errors from mixing precedence levels in Haskell, but no bugs that I can recall. I’d like it if the levels were named instead of numbered, though—I have a hunch that the majority of operators fall into a family like “additive”, “multiplicative”, &c.
People have experimented with relative precedence levels (and I believe Perl6 does this?) where you can say “this is lower-precedence than that” and you have to use parentheses to disambiguate expressions that mix operators unrelated by that partial order.
Another approach taken in some programming languages is to dispense with operators entirely and just use functions, with fairly loose restrictions on function names. For example, in principle there's no reason a function couldn't be named "+" or "|". Obviously that can lead to readability problems.
And then in some of those languages they let you create macros so in the cases where the normal function syntax looks ugly you can make it readable with no cost... But with macros comes the same sort of potential abuse as other operator overloading, so it's a tradeoff.
On the opposite end of the spectrum, you have the likes of Perl where you get all sorts of specialized operators to avoid such ambiguities, at the expense of having to learn a whole bunch of operators.
Anybody who likes that will like subnettree, too, which offers a C-based Patricia tree for IP addresses. In the past, I've gotten a lot of mileage out of this library:
How old is Python 3 now? I've always used Python for a "miscellaneous task" language, and still do... and even I find "...because you refuse to upgrade" a bit insulting. If I used it for something serious, even more so.
The way 2.x -> 3.x was handled is/was/will be an absolute disaster. Upgrading simple scripts is a non-issue. Larger projects seem to always be a horrible pain.
I used to think like you, but when I started actually using Python 3 it was all a lot less bad than I expected it to be from reading Hacker News comments. YMMV, but converting existing code for me usually didn't amount to more than adding parentheses around print statements. The proper support for Unicode alone was enough of a reason to upgrade for me.
The bigger issues of upgrading I've run into:
* Systems generally still ship with 2.7x
* Dependencies that also have not upgraded (e.g. no 3.x version exists) or require a complete re-work now
* Searching for help on X often results in 2.7x help
I've upgraded projects, but this "you refuse to upgrade" business bothers me. There is a reason people haven't: The way Python handled all of this is horrid.
It's often easier to just move to another language.
Leaving out third-party open-source projects I contribute to, and projects I've been involved with as part of my work (both open and private), and just focusing on code from my own personal Python packages:
By count of lines, 0.04% of Python code I've published exists only to deal with 2 vs. 3 issues. Of that code, 34% consists of importing and applying a decorator from Django that handles setting up either __str__() or __unicode__() on a Django model as appropriate. Tellingly, there are exactly two lines I could find that aren't that decorator and which deal with a str/unicode issue.
And that's in code which doesn't just run on Python 3, it supports 2 and 3 in the same codebase.
You're not. van Rossum has to maintain Python 2.7 as part of his day job.
Things I still have in Python 2.7:
- Code that runs on low-end shared hosting. There's a Python 3, but it's 3.2, the least-compatible version. No "six", and u'xyz' isn't allowed.
- ROS, the Robot Operating System. Python 3 support is coming, but it's not really here yet.
I converted over the dedicated servers and some IoT stuff a year ago.
The types stuff probably should have been called Python 4, not Python 3.6. Those are major language changes. The optional-types-that-are-not-checked thing seems a bad idea. I can see having checked types; that allows some important optimizations. When you know something is a machine type, such as an int or float, you don't have to box it. But the feature of type declarations without checking is a foot gun.
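The unchecked nature is easy to demonstrate; a quick sketch (add() is a made-up example function):

```python
def add(x: int, y: int) -> int:
    return x + y

# The annotations are recorded but never enforced at runtime:
print(add.__annotations__)   # {'x': <class 'int'>, 'y': <class 'int'>, 'return': <class 'int'>}

# Passing strings "works" unless an external checker like mypy is run:
print(add("a", "b"))         # ab
```

This is the foot gun in question: without a separate tool in the build, the annotations are documentation that silently drifts from reality.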
Yes. My reason is that I simply don't really care that much. It's the default on all my systems, so why bother putting in the effort to use something that doesn't appear to offer much advantage to me personally? I use python for my personal research, so my particular situation affords me this laziness.
edit: I don't mean this as a knock on py3. All I'm saying is that my situation and uses for python allow me to be apathetic towards 2 vs. 3.
I'm in a similar situation wrt what I use python for, and suffered from the same apathy until quite recently. Man, it was really easy to switch, and completely worth it just for the little things (like UTF-8 as default encoding for strings). It's really painless, but if you want to ease into it, I recommend using __future__ in your py2 stuff for now just to get you into a py3 state of mind: https://docs.python.org/2/library/__future__.html
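Concretely, a couple of lines at the top of a Python 2 module opt you into the Python 3 behaviors (these imports are harmless no-ops when run on Python 3):

```python
# Opting into Python 3 semantics from Python 2 source:
from __future__ import print_function, division, unicode_literals

print(1 / 2)           # 0.5 -- true division even on Python 2
print("text" + "!")    # string literals are unicode on Python 2 too
```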
UTF-8 isn't CPython's default encoding for strings. Since PEP 393, the internal representation is Latin-1, UCS-2, or UCS-4, depending on the widest character in the string.
I would have made UTF-8 the internal representation, and generated an array of subscript indices only when someone random-accessed a string. If the string is being processed sequentially, as with
    for c in s:
        ...
you don't need an index array. You don't need them for list comprehensions or regular expressions. One could even have opaque string indices, returned by search methods, which don't require an index array unless forcibly converted to an integer. Some special cases, such as "s[-1]", don't really need an index array either.
Go uses UTF-8 internally, but you index a Go string by bytes, not glyphs or graphemes. The "index" function on strings in Go returns an integer, not an opaque index object or a slice. You can create a slice of a UTF-8 string which is misaligned and not valid UTF-8.
Indexing a string in Python 3 returns glyphs (which are strings), not bytes. Not graphemes; if you index through a Python 3 string with emoji that have skin color modifications, you'll get the emoji glyph and the skin color modifier as separate items.
Python 3 also has a type "bytes", which can't quite decide whether it's a string type or an array of integers. Print a "bytes" type, and it's printed as a string, with a 'b' in front of it, not as an array of integers. But an element of a "bytes" array is an int, not a bytes type. This is for backwards compatibility; it's roughly compatible with legacy "str" from Python 2.
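The split personality is easy to see:

```python
b = b"abc"
print(b)        # b'abc' -- the repr looks like a string
print(b[0])     # 97     -- but indexing yields an int
print(b[0:1])   # b'a'   -- while slicing yields bytes again
print(list(b))  # [97, 98, 99]
```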
Rust struggles with this. Rust tries to prevent you from getting a invalid UTF-8 string. The solution used involves re-checking for UTF-8 validity a lot, or bypassing it with unsafe code. This gets messy.
> I use python for my personal research, so my particular situation affords me this laziness.
I am pretty much in the same situation and in hindsight I kinda wish I upgraded a little sooner. Fixing print brackets was the biggest (not that big) task, and from there on it was all python3 sugar for me :)
Most people who use a language are actually working on code that existed before yesterday. I work on a codebase that was written in python 2, and when we looked at the cost of upgrading it to python 3 versus adding new features, it was a no brainer. I have no desire to switch to python 3, nor do I anticipate ever doing so, at least for the projects I'm working on now.
If you are maintaining a large or important codebase rather than working as a hobbyist or independent contractor, then there is nothing wrong with using older languages/platforms, especially if they are stable and well tested. Avoiding surprises or breakage is generally a lot more important to you than getting a cool new list comprehension facility. Actually, stability of the language is something to be desired, as there is a cognitive cost to having different portions of the project use different language constructs.
In terms of python 2 not being supported in the future, if we are forced to stop using python 2, we'll probably need to switch the project over to Java. Not something I'm looking forward to, but at least the java community doesn't force developers to rewrite their source code when a new JVM comes out. There were some examples of older bytecode not working in newer JVMs, but as long as you had the original sources you could always compile to the newer bytecode. Personally, I've grabbed jars from 2005 and used them without any issue.
My employer is mostly a java shop, and I already have to occasionally answer questions from other devs about why this project is in python. My answer has always been that python is a more productive language for this use case, and their retort is that it's not really enterprise ready -- meaning things like stability and support, so that while the language may be faster to initially develop in, the long term maintenance cost will be higher.
The lack of respect for backwards compatibility as well as the hostility of the community to basic things like don't break working code is causing me to lose some conviction in my side of this argument. I imagine the same discussion is happening in businesses all over the country.
The question is not python 2 or python 3, but python 2 or move away from the language to one that understands my needs. This transition has created a real black eye for python and its role in the commercial space -- at least that's my impression.
>if performance is a problem, then start rewriting the critical pieces right now because python 2 also isn't for you.
You say it as if that would be a strange position to be in, but it's a common situation. A lot of the time you start with the faster-to-prototype / more familiar language, and outgrow it.
That's what companies do after they grow so much that a language such as Python/Ruby/PHP is not doing it for them anymore or wont be doing it soon with their growth trajectory.
And you don't have to be Twitter scale either.
Even if you have moderate growth, you might find that with a different language, you can use 1/10 the servers, and thus drastically lower your operating expenses.
So, when some of those companies face the jump to Python 3, and the required rewriting, they often decide to bite the bullet and do a full rewrite in another language (like Java, but Golang is also getting many Python converts, including e.g. Dropbox IIRC), which will give them much more bang for their buck.
It has e.g. a proper Union type, generics and type inference. Lack of pattern matching hurts, but then, Java doesn't have that either. You can also configure how strict you want the checks to be.
There are downsides though: type stubs are of... varying... quality, there are ugly hacks needed to resolve cyclic imports and the type annotation syntax gets yucky in places, even in 3.6.
I'm using mypy seriously for about a year and it's progressing very well. I see very good ROI even with the uglification of the code.
As an independent contractor maintaining large Python 2 codebases, your comments really hit home for me. I only have a portion of my time each month to maintain and develop features for them. If the Python 2 floodgate ever breaks, I am more likely to rewrite the system in a statically typed language. Even today, I am tempted.
I know that a sibling mentions the ridiculousness of the situation, but a forced deprecation of the Python 2 runtime (see also, Windows) is a billable, justifiable reason to do the work of refactoring parts of the system. I do not expect those billable hours to happen in the next decade, however.
This. I don't think people realize that there are fixed costs that are borne in making these kinds of migrations, so that migrating to Java from python is not 100x more work than migrating from python 2 to python 3. There is no such thing as a small breaking change to a runtime.
Maybe one way of thinking about it is to take a hard drive full of data and randomly flip a few bits. Then ask what the effort is to find and fix all the errors, and whether this effort is a function of the number of bits flipped or the size of the drive. The people who don't understand/have sympathy with my concerns are saying -- it will only introduce a few breaking changes -- but I'm looking at the size of the project.
But this condescending attitude coming from some in the python community really isn't helping the language any. I am not saying that everyone has to share my concerns, but they certainly aren't "nonsense" or "ridiculous".
Let me get this straight: your coworkers - Java devs in a "Java shop" - tell you that Python isn't "enterprise-ready" even though your project is written in Python and has been successfully doing whatever it does for many years. I refer to this dismissive sentiment by Java devs as "nonsense" and you interpret my comment as condescension by the Python community?
I'm sorry to be the one to break this to you friend, but you must be suffering from the Java shop version of Stockholm Syndrome.
Sure, we have tests, we have a whole ecosystem -- written in python2 -- but that's not a justification to introduce breaking changes or a guarantee that the tests will catch all regressions. We also have dependencies. Moreover, unless your system is a toy, tests aren't going to give you the same guarantees as having a system be hammered by real world events for 5 years.
I really hope someone makes this work. I literally laughed and thought that it was a troll the first time I saw the future print import and that being used as a compelling reason that I should switch to Py3.
"11. There should be one-- and preferably only one --obvious way to do it." from the Zen of Python. Somebody intentionally misread it as, "There should be only one way to do it." They obviously made a grave mistake. Same with many 2->3 changes.
Also from the Zen of Python: "Special cases aren't special enough to break the rules", the old print statement contains a lot of special syntax not found anywhere else in the language. Also "Readability counts", it's really obvious what print("foo", file=sys.stderr, end="") does instead of deciphering a construct like print >>sys.stderr, "foo", . Moreover you can pass a print function around, perform partial application, etc., a statement does not give you that.
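Since print() is an ordinary function in Python 3, partial application works as you'd expect; a small sketch (writing to a StringIO so the output is capturable):

```python
import io
from functools import partial

buf = io.StringIO()

# Partial application: bake in a prefix and an output stream.
warn = partial(print, "WARNING:", file=buf)
warn("disk almost full")

# And print (or any derived callable) can be passed around:
def for_each(items, emit=print):
    for item in items:
        emit(item)

for_each(["a", "b"], emit=warn)
print(buf.getvalue())
```

None of this is expressible with the Python 2 statement form.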
You prove nothing about why the two cannot co-exist in a non-regressive way. 'print' as a function already exists in Python 2, providing all the benefits you describe.
I stand very much corrected on the issue of 'print' and retract the above -- it only looks like a function syntactically in special (albeit, the most common) cases; PEP 3105 makes good points on how Python's parsing of whitespaces and parens renders a non-regressive parser out of reach. RTFM, I guess.
I do, but I understand the reasoning behind why it was changed. So I made a snippet in my editor and moved on with my life, haha. New print is now easier to to type than the old.
* Major APIs now return iterators or views instead of simple lists. This is rarely justified and introduces unnecessary complication in many cases. Everybody knows lists; why not keep it simple? I can't count how often I list() all the things just to get things going and because I didn't bother to look up the API to check what types are returned.
* I have to convert a lot of clear text data formats and needing to use 'print(x, end=" ")' instead of a simple 'print x,' is really cumbersome. I think printing something is absolutely substantial and it is justified for "print" be a statement.
* There is a distinct performance loss if you directly compare 3.x to 2.x at least for all my use cases.
* The sudden harsh break of backwards compatibility was completely unnecessary and stupid. Why not introduce new features slowly and mark old features as deprecated, or allow specifying the version in a header? There are a lot of ways this could have been handled better.
* Python 3.x may be a marginally better (e.g. more consistent) language than 2.x. For me it only introduces inconveniences. It takes a huge amount of effort to port code from 2.x to 3.x (if you want to keep it clean and readable). The practical benefits are close to zero (and you even lose performance). This is why the community seems to agree to stay with 2.x for a very long time.
> The forced unicode support is stupid for all my use cases.
bytes and bytearray are still there … if you don't want to work with unicode data in a proper Unicode type, you don't have to. But it doesn't follow that the rest of us that do want a good type for text shouldn't be able to have a good type / tooling for that.
(If you mean that functions that take text require text as input now, well, yeah.)
> Also Unicode in 3.x support is rather flawed
The entirety of "Internal Representation" is out of date, and no longer correct. For just about every other section, I believe libraries readily exist in PyPI to solve those problems. I do agree that it would be nice to have some of that closer to the standard library.
> I have to convert a lot of clear text data formats
I find it strange that you work with text formats, but you dislike having a proper text type? Even if you data were mostly or all ASCII, Python's text handling would still be mostly transparent, just silently doing the right thing if non-ASCII ever were encountered.
> I think printing something is absolutely substantial and it is justified for "print" be a statement.
I'm going to disagree. Python 2's print statement's special syntax just adds cognitive load to reading, and to newcomers encountering such syntax, that just doesn't need to be there. Further, the trailing comma for "no EOL" is too subtle from a readability perspective.
> Major APIs now return iterators or views instead of simple lists. This is rarely justified and introduces unnecessary complication in many cases.
The old APIs forced materialization of an iterable to a list, and there are plenty of circumstances where this just isn't required; it results in higher memory usage for that list. Writing list() also makes it explicit where such materialization happens.
You can always get a list from an API/function that uses generators. You can't go the other way.
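A quick illustration with dict views (the same point applies to map() and friends, which now return iterators):

```python
d = {"a": 1, "b": 2}

keys = d.keys()            # a live view in Python 3, not a list
snapshot = list(d.keys())  # explicit materialization to a list

d["c"] = 3
print(list(keys))          # ['a', 'b', 'c'] -- the view sees the update
print(snapshot)            # ['a', 'b']      -- the snapshot does not
```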
I cannot imagine any cases where a general python 2 answer isn't applicable to python 3, with at most a minor change.
In my experience, the most you are doing is shifting around imports (urllib, for instance), declaring classes slightly differently, handling string encoding differently, or printing, format-ing, and raising exceptions slightly differently (and in a more sensible syntax for all, imo).
On the contrary, there are new modules that you are missing out on and you will regularly run into "it's in the stdlib" answers that are not applicable to you.
I did for a while, until making the resolution this year to use 3.6+ whenever and wherever possible. So far, it turns out I had very few incompatibilities.
Thanks for the suggestion. I've briefly tried Anaconda before, unfortunately can't remember why I didn't continue with it.
I just tried again and saw an error similar to what I get when I try with pip+brew - ModuleNotFoundError: No module named 'PyQt4'.
Note, this is after spending quite a while installing/removing/upgrading various qt packages and dependencies to try to resolve this. The system might be in a weird state because of that - but I originally started from scratch and only followed the instructions I found on the matplotlib site...
I would recommend trying the Intel Python distribution. It comes with highly optimized numpy, scipy, matplotlib, pandas, and so on, everything compiled with icc and linked against MKL (Intel's BLAS library).
The important stuff that makes a good case for Python 3:
- Addition of "yield from" allows easier programming with async I/O à la Node.js (using 'await')
- Standardized annotations of function arguments and return values can help in the future with type checking, optimization, etc.
Even more important stuff
- Unicode can be used in symbols. You can now use Kanji characters in your function names, to annoy your coworkers and win the International Obfuscated Python Code Contest.
Other stuff
- Minor unimportant stuff that is definitely no reason alone for switching for Python 3.
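A minimal sketch of the async style this enables, using asyncio from the standard library (fetch() is a made-up coroutine standing in for real network I/O; asyncio.run() needs 3.7+):

```python
import asyncio

async def fetch(n):
    await asyncio.sleep(0)   # stand-in for an actual async request
    return n * 2

async def main():
    # Run the three "requests" concurrently and gather the results:
    return await asyncio.gather(fetch(1), fetch(2), fetch(3))

results = asyncio.run(main())
print(results)   # [2, 4, 6]
```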
Of those, only the async I/O stuff seems compelling. But compelling it is, at least as used in Curio. It feels like this is still shaking out, with the standard library and Trio (?) alternatives, but it looks really cool.
I've come into large python2 projects which had been started with non-unicode strings (because the initial developers didn't think about it). At some point a user with non-English characters invariably signs up and then shortly complains. It has been significant work to (1) convert everything that should be converted to unicode (2) re-train the developers to use the unicode syntax.
Python 3 has, more or less, just renamed unicode() to str() and str() to bytes(). unicode() support was already complete in Python 2. The rename is not a user-facing feature.
True. It is a nice thing for scientific code though. Often in a field α etc etc have a known meaning by convention and being able to write them as α rather than alpha can really make longer formulas more readable.
But I just wanted to point out that the title is a bit presumptuous. I don't refuse to upgrade to Python 3, it's that the default Python for most distributions is 2 (sometimes as far back as 2.6). If you want to write a user-space tool with Python you can either require additional dependency setup, bundle a full interpreter with your package, or just write Python 2.7/6 code that is forward compatible with Python 3...in which case I still can't use the new features of 3.
At the end of the day, the continued slow adoption of Python 3 today is because ecosystems move slowly. Not to mention the original releases of Python 3 were really rough around the edges (such as being slower than Python 2.7 until ~3.4) which definitely contributed to the slow adoption in the early years.
What I said was the default version of Python is old on most distributions.
For example, CentOS 6 ships with Python 2.6 out of the box. Even faster moving distros (like Ubuntu) are still trying to make Python 3 the default (although at least Ubuntu ships with both 2.7 and 3.4+ out of the box).
No one should be using CentOS 6 in 2017. If you think Python is your problem and you're running CentOS 6, you need to take a good hard look at your platform.
You think someone is going to start a new project and use a 7 year old OS? I don't think it's "unfortunately my problem" when biz/client reqs for a 7 yo OS are completely out of my hands, unless of course the mindset is "use Python 3 or die trying".
That's true, but many users are still affected by the fact that Python 2 remains the default. Any user can install a tool for Python 2 with pip install --user, but the user might need root privileges to install a tool for Python 3 if the system administrator never intentionally installed it.
Furthermore, Python 2.x support will expire in 3 years. It's a lot less effort to install 3.x and use it now than plan, implement, and release a 3.x upgrade in 3 years.
They have been pushing back the end of support for 2 for a long while.
But I agree with you:
> or just write Python 2.7/6 code that is forward compatible with Python 3...in which case I still can't use the new features of 3.
Although, personally, I think the best thing to do is write forward compatible Python 2 code (or at least as much as you can). At the very least write Python 2 code that lends itself well to using a migration tool like 2-to-3.
Ubuntu 18.04 LTS plans on being Python 3-only, but Ubuntu has been shipping Python 3.x since 13.04, and 3.x has been the default (with 2.x available) since 16.04. So in terms of shipping Python 3 code, Ubuntu 12.04 LTS (the last LTS where you can't assume it has a Python 3) EOL'd on April 28, 2017. Assuming $WORK follows EOL dates, you should be fine for Ubuntu.
Using the arrows keys for the slides has been a standard since the dawn of web presentations, a decade or so ago.
Doesn't have to be more discoverable imho. As soon as one finds it out, they can use the knowledge for any other presentation they chance upon -- and presentations remain clutter-free, without prompts and extra navigational buttons for the rest.
Perhaps this comment has something to do with that burial: "Not going to lie to you. I still don't get this." Personally, the most confusing thing about that code is that apparently return will wait on a yield from, which seems odd.
Python 3.5+ supports the async/await syntax which is much easier to follow than yield from. I fully expected async/await to be item #1 of a list of Python 3 killer features over list comprehension doodads; it's surprising that it's entirely missing from this list!
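For anyone comparing the two styles, here is a minimal async/await sketch using only the standard asyncio module; the coroutine names are made up for illustration:

```python
# async/await (Python 3.5+) replaces the harder-to-follow "yield from"
# style of writing coroutines.
import asyncio

async def fetch(n):
    await asyncio.sleep(0)  # yield control back to the event loop
    return n * 2

async def main():
    # gather schedules the coroutines concurrently on one event loop
    return await asyncio.gather(fetch(1), fetch(2), fetch(3))

print(asyncio.run(main()))  # [2, 4, 6]
```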
It was source-level incompatible with older code so it couldn't be used as a smooth transition like... almost any other language upgrade I've ever seen.
Breaking code like this was a big mistake in my opinion - it resulted in many people sticking on the old version for code and library compatibility. There are still libraries which don't work on Python 3, though now fairly few of them.
In addition to that, the unicode changes were in my opinion another mistake - they made it far more complicated to deal with strings and byte arrays as now even for the simplest of applications you have to put thought into character encodings. Another annoyance for me personally is that I often use python to do things like hex and base64 encoding, the removal of .encode("hex") really wrecked this one for me. Now I have to remember weird import libraries like "binascii" with weird function names like "hexlify".
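For what it's worth, the Python 3 replacements for the old hex codec are short once found: binascii.hexlify, or (since 3.5) methods directly on bytes with no import at all:

```python
# Python 2's data.encode("hex") / "deadbeef".decode("hex") are gone;
# these are the Python 3 equivalents.
import binascii

data = b"\xde\xad\xbe\xef"

print(binascii.hexlify(data))     # b'deadbeef'
print(data.hex())                 # deadbeef (Python 3.5+, no import needed)
print(bytes.fromhex("deadbeef"))  # b'\xde\xad\xbe\xef', back the other way
```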
Ultimately Python 2 works. And it works well, there just aren't that many compelling reasons to move to 3 given that it often is more complicated and breaks existing code. They have a few cool things now, so I sometimes use it for new code but there's still nothing that's making me want to delve into a port of my older codebases.
I strongly disagree with respect to unicode changes. From my perspective, where 95% of my Python is Python3+Twisted implementing obscure and obtuse protocols, the strong firewall between byte strings and character strings plays a key role in my keeping my sanity. Unicode + byte strings alone are worth moving to Python3. I also make heavy use of other new features, I can't imagine going back to Python2.
Most Python 2 codebases probably have a lot of unknown bugs in them related to Unicode. The code will work fine for cases where the inputs are ASCII, but will subtly break if unicode inputs are provided.
In terms of code reliability, Python 3's approach is much more sane.
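A minimal illustration of that strictness: Python 2 would silently coerce between the two types whenever the bytes happened to be ASCII, and only blow up when non-ASCII input arrived; Python 3 refuses to mix them at all, so the bug surfaces immediately:

```python
# Mixing text and bytes is a TypeError in Python 3, regardless of content.
try:
    "naïve" + b" bytes"
except TypeError as e:
    print("refused:", e)

# The boundary must be crossed explicitly, with a stated encoding:
text = b"na\xc3\xafve".decode("utf-8")
print(text)  # prints naïve
```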
I work at a company that uses Python 2 in older projects and Python 3 in newer projects, and this 100% matches my experience. Python 2's string implementation is simply busted, and the benefit you get in exchange for that brokenness is that some cases are slightly shorter to write.
> It was source-level incompatible with older code so it couldn't be used as a smooth transition like... almost any other language upgrade I've ever seen.
Ruby released the source-incompatible Ruby 1.9 just months before Python 3. IIRC most versions of Swift have been source-incompatible with each other. Even a language as buttoned-down as C++ has made source-incompatible changes[1]. I have no idea where people got the notion that Python invented the idea of breaking changes, but it just isn't so.
I don't know how Ruby handled it, but Swift and C++ have the advantage of being a static typed language, and most breaking changes can be detected by the compilers. When you transition from Python 2 to 3, you never know which line will explode when you run it.
Go ahead and look at those breaking changes, then compare with Python 2 to 3's breaking changes.
It's a difference of a few minor things versus a whole swath of changes to the handling of strings in the language, heck, a python 2 print statement isn't compatible with python 3.
These are much more significant breaking changes which make porting code much harder than any of the other examples you've given. For the most part, developers don't have to touch or only have to make minor changes for specific cases to port to Ruby 1.9 or C++11 - look at those C++11 changes for example, so few people were doing those things and they're such bad practices that it simply didn't matter. They successfully preserved compatibility in the nominal case. But even following all the best practices in your Python 2 code will not make it run under Python 3.
I don't use Swift or C++ much, but I have several years of experience in both Ruby and Python. The changes in Python might be a bit bigger, but the changes in Python 3 and Ruby 1.9 seem of roughly the same magnitude to me. Text encoding is now a problem you need to think about, some methods were removed, other methods were added, the output of some methods changed (e.g. the output of Array.to_s, the result of indexing a string), a few syntactic constructs changed (e.g. character literals used to return numbers, now return strings), and all C-based libraries broke in Ruby 1.9. In both cases, more than 99% of your code will probably be the same (unless your program is mostly print statements, I guess). What's more, Python took more pains than Ruby to ease the upgrade process, with stuff like __future__ imports, 2to3, and the six library. IIRC the Ruby team just changed stuff and said, "OK, update your code accordingly and we promise not to do this to you again for a long time."
It seems to me like the primary difference that made the Python upgrade worse is that the authors of the big libraries in Ruby were enthusiastic about moving to Ruby 1.9, so that it was possible to start porting most Ruby apps very soon after 1.9 released, while a few big Python library authors dragged their feet on even getting started. This was a very real problem, but it was essentially a cultural problem of the library landscape being very different, not a matter of the language itself being drastically different.
Drastically changing the behavior of the most commonly used internal type (string) is a really hard breaking change. Not disputing the why, but for some code, that's a huge change. One that automated tools like 2to3 don't help with.
Yeah, I know about that, but that's still longer and more annoying than the old way. Maybe I should just make my own library that just modifies strings and byte arrays and adds a few things for the common binary operations I do.
A combination of Unicode strings breaking a lot of libraries and Py3K not having a lot of user-visible improvements early in its lifetime. It built up a base of people who said "they broke all my code and I don't see any reason to fix it". A lot of people recommended (legitimately) that you stick with Python 2 until Py3K was more widely supported. The momentum from this group persisted even after the point where it made sense to switch to 3.
And so we are stuck, especially in enterprise where it could cost a lot of time and money to port because the transition wasn't managed well, both by users and the Python team.
Because after nearly 10 years this mediocre list is the best they can offer as consolation for breaking your code, and for much of that time they didn't have everything it does now. If py3k had shipped with something like https://gregoryszorc.com/blog/2017/03/13/from-__past__-impor... there probably would have been more uptake.
This piece barely scratches the surface; there are also formatted string literals, optional type checking, and compact ordered dicts by default in the latest. All awesomeness.
I'm glad they did the "mediocre" fixes as well, they lead to a lot fewer encoding (and other) bugs, which were a real problem on nontrivial programs.
Type checking is given in the piece. Compact dicts are nice for memory performance, but I don't use Python for performance. Ordered dicts I only care about rarely and explicitly use one in such cases. Formatted string literals have not really been a "feature" but a concern. The original plan was to get rid of the % way of formatting and just use .format with __format__ (except now there are pitfalls and inconsistencies with different types of strings). Now Python 3.6 has string interpolation with f-strings, which is kind of nice, but I've never really missed it when I come back from using other languages with it, and if I really wanted it there has been tools like Interpy. That last bit is kind of the nail in the coffin, a lot of Py3 features have been available as Py2 libraries or higher source parsing tools longer than the feature has been in Py3.
I didn't like how the changes to strings and bytes were made, but that's probably a larger discussion, and fits within how Py3 has evolved. If anyone is really confused by the struggle with adoption, just look through the release notes of version by version, and ask at what point there's a compelling reason to upgrade. I still don't see one, but at this point, nearly 10 years after Py3k, Py3.6 is finally usable enough that I wouldn't mind as much if I had to work on a Py3.6 codebase instead of a Py2.7 one.
One of the chapters added in the second edition of Peopleware quotes from a presentation by Steve McMenamin: "People hate change ... and that’s because people hate change ... I want to be sure that you get my point. People really hate change. They really, really do."
It wasn't broke and they fixed it. Python 2 is a mature language with a robust ecosystem. Those just don't go away no matter how much developers nag people to upgrade.
Except it was, and badly. Unicode support reached sanity matching Java or C#, if only just, but for me the most important fix was making exceptions actually work instead of being a broken pile of workarounds glued together with ambiguous and/or confusing syntax.
For some definition of "badly" that doesn't include a huge community of working software I guess. My point wasn't that Python 3 isn't better (or even "better enough") it's that no matter how much better Your Favorite Different Thing is, it will never be better enough to make people stop using their working software.
What does the split have to do with 2 vs 3? Really it just seems like some devs are still griping about fairly minor backwards compatibility issues. The whole print vs print() thing is obviously a straw man, but the major one, strings, unicode, and bytes, changed for the good in py3. The fact that people have to go back and "un-hack" their code costs time, effort, and more commits, but it's for the best.
<rant>
I think the real split is between CPython + C extensions and the JIT (PyPy, Numba) gang for heavy lifting in Python. I for one wish Guido would embrace the effort of integrating the GILectomy into CPython and make Python thread parallelism real for CPU-bound work, as opposed to the high-overhead multiprocessing approach.
Also, including mypy in the stdlib to make genuine optional static typing would make Python a de facto standard for much of what it does now, without the "we love python but...." production-code pain points.
</rant>
I don't think the whole print vs print() problem is a straw man (or similarly the division operator switch) apart from being relatively easy to fix or avoid by importing from __future__, but crucial in understanding the attitude the language owners have for its users. They could have kept print and introduced printf(), but they didn't. They didn't even get rid of the statement vs. expression distinction, which would have been a compelling reason to break print (and would give us real lambdas).
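For completeness, the two __future__ imports that cover exactly these cases; under Python 3 they are no-ops, so the same file runs identically on both interpreters:

```python
# Opt in to Python 3 behavior from Python 2 for print and division;
# harmless no-ops when run under Python 3 itself.
from __future__ import print_function, division

print("print is a function now")
print(7 / 2)   # true division: 3.5, not 3
print(7 // 2)  # floor division is still spelled //: 3
```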
The split you mention is real, but it's existed even longer than the 2vs3 split. All the various alternate Python implementations that have come and gone or are still around with the exception of Numba were started pre-3k. I wouldn't even argue that upstream was necessarily wrong not to prioritize performance and more specific use cases (like certain production uses, or scientific uses (a lot of begging to finally get the @ operator)), it just didn't matter as much for a long time. But things are different now. There are many other very expressive and much faster (either dynamically typed or static) languages in competition that lessen Python's effectiveness at reducing dev time in exchange for more hardware. Without a more serious focus on performance, Python will be driven only by momentum. That can last a long time, and is essentially the end point anyway when performance is "good enough" compared to the close alternatives, but it's not a great look when there's still much that could be done.
So, you see, I know a bit about writing efficient code. I have held a firm belief, and still do, that if you're doing heavy lifting in CPython, you're doing it wrong, and better parallelism will not significantly help you.
Glad to see the Enum class made the list. We've put them to good use in our codebase, especially the Flags enum variant, which gives us a powerful way to annotate data samples with all the sorts of flagging information we want to tack around them.
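A minimal sketch of the Flag variant (Python 3.6+): members combine with bitwise operators, which suits sample-annotation use cases like the one described above. The flag names here are invented, not from the poster's codebase:

```python
# enum.Flag members can be OR-ed together and tested with "in".
from enum import Flag, auto

class Sample(Flag):
    CLEAN = auto()
    TRUNCATED = auto()
    OUTLIER = auto()

flags = Sample.TRUNCATED | Sample.OUTLIER
print(Sample.OUTLIER in flags)  # True
print(Sample.CLEAN in flags)    # False
```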
The slides on chained exceptions (feature 3) are missing one thing, I think. The "raise from" example is not really a way to "do this manually", but rather a way to make explicit the relation between the two chained exceptions (as can be seen in the traceback messages), which is quite helpful.
The first example says that one error occurred, we tried to handle it, but then another error occurred in the handling. E.g. "Failed to find eggs in refrigerator. Tried to buy eggs, but tripped and broke leg."
The syntax in the second example should be used when the exception in turn causes a larger process to fail, e.g. "Failed to make pancakes due to a failure to find eggs in refrigerator."
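The explicit form can be sketched with the pancake example: raise ... from records the original exception on __cause__, which the traceback machinery then reports as the direct cause (a bare raise inside an except block records __context__ automatically instead):

```python
# "raise ... from" makes the causal link between two exceptions explicit.
def make_pancakes():
    try:
        raise KeyError("no eggs in refrigerator")
    except KeyError as e:
        raise RuntimeError("failed to make pancakes") from e

try:
    make_pancakes()
except RuntimeError as e:
    # The original failure is preserved on __cause__.
    print(type(e.__cause__).__name__, e.__cause__)  # KeyError 'no eggs in refrigerator'
```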
I still don't think that any of those new features justifies a total compatibility break like the one that was artificially induced from Python 2.7 to Python 3.
Python3 is good, but should have happened as a smooth transition from python2.7. The way it was handled was just a mess, and still keeps polluting the Python world.
Next time somebody asks what Java has over Python... here it is: nothing like the python 2 vs 3 mess.
I'd like to respectfully disagree. I don't believe that there is a way to change how one of the fundamental types of a language works (string/unicode vs bytes/string) and make that a "smooth" transition. I know that it causes a lot of strain in the python community, but I applaud the core developers for making that change, instead of getting stuck in the well-known "we have to support all legacy versions" hell.
Honestly, if they had just kept the print statement, the pushback would have been almost non-existent.
It was that one change that pushed python 3 from "some programs and libraries will need to be written" to "almost every single program and intro to python tutorial needs to change".
That’s a giant can of worms. One example problem: Python 2 has a type of class (old-style) that doesn’t exist in Python 3. What happens when you try to pass this type of class or one of its instances between the languages? There’s a bunch of stuff like that. You’d end up with a very weighed-down interpreter with a bunch of caveats if you ended up with one at all.
Would doing that then deprecating those features after 5 years or so not be better than what's happened with the Python 3 transition? I'm curious what options there are for how to make such a transition smoother.
The important part of Python 3 is the bytes/str/unicode rework, and there’s no way to make that not a breaking change. Everything else is minor stuff that’s allowed to be breaking because the str change already broke backwards compatibility.
(And yes, making strings text rather than bytes was really that important. Python 3 would be a huge improvement even if it were only that and some other safety things like removing the default ordering of incompatible types.)
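That last safety change is easy to demonstrate: Python 2 evaluated 1 < "a" to True under an arbitrary cross-type ordering, silently hiding bugs, while Python 3 refuses:

```python
# Ordering comparisons between incompatible types raise in Python 3.
try:
    1 < "a"
except TypeError as e:
    print("refused:", e)  # Python 3 raises TypeError here
```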
The new keyword-only arguments look great, but it looks like it relies on adding a " * " parameter that allows any number of arguments. What if I want the safety of keyword-only arguments, but I don't want varargs? Is there a way to do that?
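As far as I know, the bare * already does exactly this: parameters after it are keyword-only, and no extra positional arguments are accepted (only *args, with a name, collects varargs). A quick sketch with made-up parameter names:

```python
# A bare "*" makes the following parameters keyword-only WITHOUT
# accepting varargs; surplus positional arguments are rejected.
def connect(host, *, port=80, timeout=10):
    return (host, port, timeout)

print(connect("example.com", port=8080))  # ('example.com', 8080, 10)

try:
    connect("example.com", 8080)          # passing port positionally is an error
except TypeError as e:
    print("refused:", e)
```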
Almost every time I see something introduced in Python 3 I have an impression that current Python envies Perl its baroque semantics, except that Perl usually tries to guess what the programmer meant, while with Python the relationship is reversed: the programmer needs to guess what the language wants.
Python's main selling point was simplicity of syntax and semantics that preserved its high-level language status. This simplicity is no longer present in the 3.x line.
Well, try to show this for me, especially since you didn't use searchable terminology, so it's hard to confirm what you're saying. And then tell me how much of that is important for typical code.
Python 2 has old-style classes, new-style classes, functions, bound methods, unbound methods, slot wrappers, descriptors, method wrappers, built-ins, ... and I'm rather certain I forgot some. All the kinds of callable types I mentioned behave differently. This makes the language complicated, and due to the dynamic nature leads to subtle bugs that are only detectable at runtime. Python 2 is a lot like C there. This isn't limited to callables but is spread around the language in a lot of places. Python (both 2 and 3) is not a simple language. It has a supremely complicated data model whose documentation is insufficient and scattered.
Ruby's 1.8 to 1.9 transition seemed to go much smoother - I'm curious what the difference is. Just down to what is essentially better source comparability I guess.
Ruby 1.8 to 1.9 was a similar mess for a shorter period of time, and not because it was handled better, but because there was less diverse entrenched code and fewer subcommunities to deal with: when Rails and popular Rails libraries worked, most of the community could move; Python had a whole lot more that had to be addressed, the curse of its broader success.
(Also, being already expression-oriented, Ruby didn't have to deal with running into problems where a core feature was a statement that couldn't be used in contexts limited to expressions; fixing that is inherently painful.)
My guess is it could also have to do with the communities. Anecdotally, the limited sample of Rubyists I have interacted with tend to have more tolerance of churn for improvement's sake, whereas the people attracted to Python are more interested in stability because they are focused on some other need, to which the code is merely a means.
And/or because many Python projects have completely insufficient testing which means that any change that might break anything anywhere can go undetected.
Ruby managed to get their major libraries ported to 1.9 much, much quicker. RoR basically worked from day 1. Python 3 was without both numpy and django for well over a year. Also, the first few releases of Python 3 weren't very good. It wasn't until 3.2 that you could really use Python 3 in production, and not until 3.5 that most of the interesting features were in place. Basically a lot of early momentum was lost due to this, and it never really regained it.
Just wondering but what should you do if you decide to go with Python 3 and find a library you want to use that isn't compatible and you are short on time?
That fear held me back 7 years ago when I started learning Python. In 2017, having worked in many projects, what I have to tell you is this: I never found such case where Python 3 wasn't supported, and I should've learned Py3 to begin with. YMMV, but the fear was unfounded back in 2010, and even more so today.
There are very few projects without Py3 support, and most you'll find without Py3 support is because the project has been dead for quite some time.
At this point in time, if you want to use a library and it isn't compatible with py3, then you probably shouldn't use that library. Essentially all actively maintained libraries are converted, and we are even starting to see libs that do not support py2.
As a counterexample, there's ROS (Robot Operating System, which is not an OS actually but a middleware plus a lot of libraries), and it still doesn't support Py3 officially. There is some support in fact, but you probably don't want to have even more potential problems with ROS for not that big of a benefit.
Nowadays it's more likely that a Python 2 user wants to use a library and finds it doesn't work on Python 2, because the developers never targeted nor tested on 2.
Probably because there was a lot less Swift code than Python in the wild at the respective times each came out with the new version, so having everyone upgrade all the old, incompatible code was less daunting.
I'll never understand those who break the internet on purpose and then complain that nothing works, and that the people actually spending money and doing work should cater to their niche tastes.
I'll never understand those who cannot serve a single page of static content without adding a bazillion of gratuitous hard dependencies on JavaScript libraries.
Maybe if more people resisted the temptation to use the latest and shiniest frameworks where it isn't warranted, I wouldn't have to upgrade my otherwise perfectly functional hardware every five years because it just can't handle all of today's most fashionable abstraction layers over plain HTML.
I've been avoiding Python like fire since it threw a YouHaveAMissingTabSomewhereException at me 6 years ago.
And it always pisses me off to see Python source code in all lower case. Someone would think upper case letters cost money and underscores are a must, like a lemon in a Corona.
Zero - The Number of Applications that Can only be Developed in Python 3
or
7456324 - The number of companies that only use Python 2.x
Although these made-up titles are slightly tongue-in-cheek, they do serve to illustrate that, for me at least, I do not have a compelling reason to switch to Python 3.
Title 1 of course. Title 2 is a far more compelling reason to stay with Python 2.x. If I program for fun, then Python 3 is definitely worth playing around in, but for professional development, everything I do in Python is done in Python 2.x. That is not a theoretical argument, that is fact.
> for professional development, everything I do in Python is done in Python 2.x. That is not a theoretical argument, that is fact.
And everything I do professionally in Python is done in 3.x as of this January. Also fact! Industry standards sometimes move slowly, but they are moving.
That is only relevant if there is no significant professional Python 3.x development taking place. There is, in fact, significant 3.x development taking place, and that number is growing - exactly as one would expect.
And by significant, I mean upwards of 40%. So, except for corporate inertia, or the rare can't-live-without libraries that have yet to be ported to 3.x (see: http://py3readiness.org), there are no compelling reasons to stick with Python 2.x for new projects.