10,000TB storage cartridges could become mainstream by 2030 (techradar.com)
65 points by Brajeshwar on Dec 2, 2023 | 81 comments


This article is really bad: "making slow hard drives and tapes obsolete" and "In contrast to data usually stored on the best hard drives and the best SSDs of today..."

No. My understanding from the video is that it's a write-once solution, so if anything it would just be a replacement for tape backup; I'm not sure why the article compares it to hard drives and SSDs.


Many data centers use spinning disk hard drives in a write-once configuration. This is fine for some use patterns and lets them take advantage of SMR and higher density. Dropbox has blogged about this: https://dropbox.tech/infrastructure/extending-magic-pocket-i...

My understanding is that every drive is write-once and once full it doesn’t change. Eventually, if most of the file data/blocks stored on a drive are deleted by users, the whole drive is erased and re-written.

That said, while being write-once only isn’t a blocker, my guess is seek latency will still prevent this tech from replacing spinning disks.


It does change due to deletes. Otherwise, it would be obscenely expensive to never delete data. Garbage collection is hard, but since it's async it can also be done efficiently, even on SMR drives.


At this scale, does it really matter? Right now everything I record in the cloud has an infinite history attached to it, so nothing is ever really "deleted" as I understand it; the index merely points to the latest version of a file.

If the deletion is meant as a security feature, then you can "delete" a file by encrypting it and destroying its decryption key.
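For what it's worth, that "crypto-shredding" trick is easy to sketch. A minimal example using Python's cryptography package (the file contents and key handling here are illustrative, not any provider's actual implementation):

    # Encrypt data destined for write-once media; "delete" it later by
    # destroying the key, leaving the ciphertext unrecoverable.
    from cryptography.fernet import Fernet

    key = Fernet.generate_key()                        # lives in a small, mutable key store
    blob = Fernet(key).encrypt(b"user file contents")  # this is what hits the archive

    # Later: discard every copy of `key`. The blob stays on the medium forever,
    # but without the key it is effectively deleted.
    del key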


Most clouds these days do have a limit on maximum version retention. Otherwise you'd effectively get infinite storage for free.

If a cloud storage provider doesn't charge for infinite version history, that's not very different from offering "unlimited storage" plans: they have pretty good statistics on how much the average user really uses on an "unlimited" plan and will price their services accordingly. Revision history just adds another term to the same equation.


Agreed with this point! At this scale more aggressive garbage collection doesn't save you much actual money.

In the early days at one such provider I know that garbage collecting deleted user files was considered a relatively low priority cost-wise.


The more data you have, the more savings. Savings at 1 exabyte are way different than savings at 10, 20, 50 exabytes. This ends up being a lot of storage racks. File revisions are a minimal part of the story.


> My understanding from the video is that it's a write-once solution,

If you have enough capacity, write-once is enough. Indeed, MLC SSDs can often be rewritten a couple hundred times. Consumer HDDs are specified for 180TB written per year.

Would you rather have:

* An HDD that can store 22TB and can be written 180TB per year for 5 years until the helium leaks out (900TB total writes).

* A 10,000 TB storage cartridge that can initially store 10,000TB, but as you "erase" stuff the capacity falls.

The latter can tolerate 1,000 TB "erased" per year for 5 years and still store 5,000TB at the end.
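(Back-of-the-envelope in Python, restating the numbers above; the 5-year horizon and erase rate are the same assumptions as in the comparison:)

    # HDD: 22 TB capacity, rated for 180 TB written per year, ~5-year life
    hdd_lifetime_writes_tb = 180 * 5                     # 900 TB total

    # Write-once cartridge: capacity only shrinks as data is "erased"
    cartridge_tb = 10_000
    erased_per_year_tb = 1_000
    left_after_5_years_tb = cartridge_tb - 5 * erased_per_year_tb  # 5,000 TB

    print(hdd_lifetime_writes_tb, left_after_5_years_tb)  # 900 5000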


The title could use some changes in wording, but "In contrast to data usually stored on the best hard drives and the best SSDs of today..." was clearly referring to the data retention time of those media, and that is indeed often quoted as being on the order of years, not thousands of years, for HDD/SSD data carriers. I'd say that's quite contrasting indeed, if that ceramic storage can keep its promise of 5,000 years of data retention.


> was clearly referring to the data retention time of those media, and that is indeed often quoted as being on the order of years, not thousands of years

Correct, but the primary reason those technologies have shorter retention times is because they are rewritable. It's an apples-to-oranges comparison that doesn't make any sense. It would make sense if the article was clear that this was about improving archival storage options, and then comparing to things like CDs/DVDs, tape, and film.


They have shorter retention times not because they are rewritable, but because:

* HDDs fail, or these days the helium leaks out at the end of their lifespan.

* SSDs slowly have the data leak out of their floating-gate cells.


I always wonder about what we will do for reading. It's not easy to read back a tape that's a few decades old, even if the tape is pristine. You'd really have to scavenge for a working tape drive. Multiply that timeframe and I really don't see the appeal of having data stored for thousands of years.


Not only that, but the article describes a system that lives in a remote data center, not something I can have in my laptop. Although if it fit in my laptop for the same price as a hard drive, I could probably 'struggle along' with a write-once solution and not worry about filling 10,000 terabytes any time soon. Complete history on every file!


Exactly. I guess "making CDs, DVDs and Blu-rays obsolete" doesn't have the same ring to it.


I am excited by the durability. We need an archival storage mechanism that can last for a long time. Hopefully, this will be cheap and come in smaller sizes for home use.

For example, backing up photos is annoying. Hard drives are unreliable, cloud storage can go away, and tape is expensive.


Use cloud storage the unorthodox way, like I do. Make multiple accounts, then create a big file with your backup data (pictures, projects, etc.) and put it in a VeraCrypt container. Then split this file into multiple pieces and upload each to the accounts (I use Protonmail, Yahoo, Gmail - a few different vendors, so if one goes belly up I can still access the backup from the others).

It's a hassle at the beginning to create the accounts, but you can create a program that has an embedded browser (I use Chromium) and then automate the tasks. Basically you just simulate keyboard and mouse over the Chromium surface and the vendors are none the wiser. I keep the info for each account in a local SQLite DB and the program does the automation, requiring minimal interaction from me. So the only real backup I need is that SQLite file. The rest is scattered around the cloud with no one having access to it but me.
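The splitting step, at least, doesn't need browser automation. A rough sketch (the container name and chunk size are arbitrary assumptions):

    import hashlib
    from pathlib import Path

    CHUNK = 512 * 1024 * 1024      # 512 MiB pieces, arbitrary
    src = Path("backup.hc")        # the VeraCrypt container

    with src.open("rb") as f:
        for i, piece in enumerate(iter(lambda: f.read(CHUNK), b"")):
            part = Path(f"{src.name}.{i:04d}")
            part.write_bytes(piece)
            # record a checksum so each piece can be verified after download
            print(part, hashlib.sha256(piece).hexdigest())

    # Reassembly is just concatenating backup.hc.0000, backup.hc.0001, ... in order.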


If you need cloud storage for archival, consider Amazon S3 Glacier Deep Archive. $1/TB/month and S3 is probably the last storage service on the Internet to "go away".

Retrieval costs are pretty awful but helluva lot better than losing everything.
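Rough numbers for, say, a 4 TB photo archive (the prices below are approximate 2023 list prices and my assumption, not a quote):

    tb = 4
    storage_per_month = tb * 1000 * 0.00099      # ~$4/month to keep it in Deep Archive
    full_restore = tb * 1000 * (0.0025 + 0.09)   # bulk retrieval + internet egress, ~$370
    print(round(storage_per_month, 2), round(full_restore, 2))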


S3 may be the last storage service on the internet to go away, but if you stop paying for a while (you forget, your credit card expires, medical issues), then your data will disappear.


Why do people pay for things like Google Drive or Dropbox when S3 storage is so cheap?


Why use Dropbox when you can already build such a system yourself quite trivially by getting an FTP account, mounting it locally with curlftpfs, and then using SVN or CVS on the mounted filesystem, back in 2007 even?


I understand some of these words.


It's a reference to the comments of the "Show HN" of Dropbox back in 2007: https://news.ycombinator.com/item?id=8863


Because I can't store my gmail and Google Docs documents on S3, and because Google Photos is easier to use than cramming my stuff to S3.


Yup. It's not just that I use Docs, but I don't know what I'd do without Google Drive Search across all my documents. Same with searching photos using text search.

Also Google Drive for Desktop. I'm sure there's an equivalent for S3 but I don't know if it works as well.


Every backup restore test will also be quite expensive with Glacier.


Similar idea to Microsoft's Project Silica[0], though this is claiming much higher data density.

Were I Google or AWS, I would buy them out immediately, if only to deny the other potentially cheaper archival storage. Or even to keep the rest of the market reliant upon tapes. Want long-term data storage? Only the cloud has the real technology for that.

[0] https://www.microsoft.com/en-us/research/project/project-sil...


There is a big difference between managed archives and the media itself. Services like Iron Mountain predate the modern Cloud.


> This process imprints holes – or no holes – onto the surface layer, which represents binary information.

It’s high-density ceramic punch cards!


Well, that's how CDs and other optical media also work.


Next: Cerabyte goes down the drain by going bankrupt, simply dissipates into thin air, or gets acquired, and then we'll never hear about them again.

I vividly remember multiple archiving products being announced; all of them disappeared, somehow. Same with "groundbreaking" battery solutions.


This is basically optical write-once tape technology. There is literally no way to get to TB/cm² (that's ~100 bits per linear µm) with their technique, using visible-range light and spatial light modulators, in a single layer.

They reveal that their future system will use 5 µm thick tape (good luck! that's half the thickness of Saran Wrap), so I assume the claimed TB/cm² density includes a bunch of layers. That's fine, but the claims about a 5,000-year lifetime on 5 µm polymer tapes with nm-thick ceramic are pretty extraordinary (requiring extraordinary evidence.)
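The single-layer objection is easy to sanity-check against the diffraction limit (the wavelength and NA below are my assumptions, roughly Blu-ray-class optics):

    # Areal density needed for 1 TB/cm^2 vs. a diffraction-limited optical spot
    needed_bits_per_um2 = 8e12 / 1e8       # 1 TB/cm^2 ~= 80,000 bits per square micron

    wavelength_um = 0.405                  # violet laser
    na = 0.85
    spot_um = wavelength_um / (2 * na)     # ~0.24 um minimum feature size
    single_layer_bits_per_um2 = 1 / spot_um**2   # ~18 bits per square micron

    print(needed_bits_per_um2 / single_layer_bits_per_um2)  # shortfall of a few thousand x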


It's about a year to the outer planets on Falcon Heavy. Less on SLS, but I digress.

Suppose a flyby mission records a couple of these cartridges and flies back to Earth. Assume 3 years round trip (1 there, 1 recording, 1 back).

That's Nx80,000,000,000,000,000 bits (10,000 TB per cartridge) in 3 years.

Or roughly Nx10^17 bits / 9x10^7 seconds.

Or about a gigabit per second effective rate per cartridge.

Not bad. Psyche (our best optical link from deep space) has a 2 megabit/s peak rate.
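(For anyone checking the math:)

    # "Station wagon full of cartridges" effective bandwidth from deep space
    cartridge_bits = 10_000e12 * 8         # 10,000 TB per cartridge = 8e16 bits
    round_trip_s = 3 * 365 * 24 * 3600     # ~9.5e7 seconds

    print(cartridge_bits / round_trip_s / 1e9)  # ~0.85 Gbit/s per cartridge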


What's so difficult about optical links from deep space compared to low earth orbit, where 200 gigabit throughput has been achieved? Is it just the attenuation?

I would have imagined that we could upgrade the communication equipment on a space probe much more easily than we could add fuel for a return trip.


> What's so difficult about optical links from deep space compared to low earth orbit, where 200 gigabit throughput has been achieved? Is it just the attenuation?

Yup; 300 km vs 600,000,000 km. Less than a trillionth of the power comes through for the same emitted power and apertures.

So, you end up having to make apertures much larger and point much, much more precisely. You can also increase laser power some, but that's a small part of your solution.

(Or, of course, you can reduce speeds to have more energy per bit).
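(The same point as a two-line calculation:)

    leo_km = 300
    deep_space_km = 600_000_000
    print((leo_km / deep_space_km) ** 2)   # 2.5e-13: less than a trillionth of the power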

I don't think anyone is serious about shipping back physical data from deep space, but the station wagons full of tape thought experiment is always fun.


I am semi-serious. But it's more about a periodic backup of an on site science station.

An on site (orbital) station could run all the models and algorithms and analysis you want on the gobs of imagery and spectral maps / cubes you can gather in real time.

Analysis is an excellent compression algorithm. But, you'd want the raw data eventually, and that means either trickling back 0.1%, or waiting a few years for the full set.

https://josh.vanderhook.info/publications.html#josr2022sspe


Hmm

What’s your sense about SETI or the Fermi paradox, if a signal becomes so vastly diluted just within our solar system?

I’m sure the SETI people have thought about this and made various calculations, but with the inverse square law and the vastness of space, maybe “needle in a haystack” is optimistic.

Is it the wrong model to think that anything but maybe a galaxy-scale civilization is just going to have its signals more or less totally dissolved into seemingly random cosmic fluctuations, relative to our sensors/receivers at least?


Maybe. Recovering data on interstellar distances is hard.

Integrating a long time to see if there's a signal there above background levels is maybe not so hard (especially if it was intended for detection in this scenario).

The big issue for data recovery is energy per symbol. If you can integrate for hours, that can still be a lot of "special photons" (whether they're on a weird radio frequency or light wavelength).


I would argue that the time spent traveling there doesn't count, because you wouldn't get any interesting data during that time from a probe using radio.

Of course, this isn't realistic, but the "station wagon full of tapes" thing is always fun.


Startup claims to have storage technology so groundbreaking that it can reach TBs/cm^2, GBps speed (read/write?), seconds (?) of read latency, low cost of <$1/TB, and durability of thousands of years. Even attaches a 2030 mass market timeline to it, which is not far into the future. I'm skeptical, and the 2030 number makes me even more skeptical.

I didn't watch the hour-long presentations though.

https://storagedeveloper.org/events/agenda/session/527

https://storagedeveloper.org/events/agenda/session/603



DNA "storage" is great for parallel interrogation I've read, but for actual "storage" storage proteins are far longer lasting. With fossilized proteins having been recovered after over half a billion years. There's some loss in fidelity as the protein side chains undergo some chemical processes, but these processes are known and can be compensated for.


DNA has been extracted from Neanderthal bones more than 50,000 years old. DNA is very durable, for all practical purposes. I've heard some DNA synthesis vendors like Twist Bioscience will offer it commercially next year.


Storage conditions really matter for DNA longevity, and a lot of the older DNA that can be extracted and sequenced is both fragmented and damaged in areas. I've had closed-circular plasmids fragment completely after being stored in water at 4C for about a year or so.


I wonder if we'll ever get to a point where we don't think about storage at all, or whether we will always fill up whatever storage is available.


For a lot of applications we have pretty much reached that point already. 2TB NVMe SSDs are around the $100-$150 price point these days. Unless they are actively trying, the average desktop user is never going to fill that up. There are only so many holiday pictures you can take, after all.

I think the size of audio files is a great example of why storage needs won't infinitely grow. Although we have orders of magnitude more storage space these days, audio files haven't really gotten any bigger since the CD era. If anything, they have gotten smaller as better compression algorithms were invented.

The thing is, human hearing only has so much resolution. Sure, we could be sampling audio at 64-bits with a 1MHz sample rate these days if we wanted to, but there's just no reason to. Similarly with pictures: if you can't see any pixels standing a foot away from a poster-sized print, why bother increasing the resolution any more?

The big consumer-range data hogs are 1) video torrents, and 2) games. Both of them have a natural upper bound due to human perception. They might still grow by an order of magnitude or two, but it won't be much more before it just becomes pointless.

Enterprise is a bit of a different story, of course - especially now that AI is rapidly increasing the value of data.


I don't think using human perception as a standard holds up for video games. For audio and video it makes sense, but we are a long way off for video games.

For video games there is a huge "storage waste" factor, one that I think is more important than the human-perception limit. If you look at modern games, you could probably throw away a double-digit percentage of most games' size if a capable team had the time to optimize for disk space. It's simply not done because it has little advantage. I think this wastage factor will scale with the complexity of video games regardless of graphical fidelity.


> 2TB NVMe SSD are around the $100-$150 price point these days. Unless they are actively trying, the average desktop user is never going to fill that up. There are only so many holiday pictures you can take, after all.

4k family videos would like to have a word with you.


Big brother potential aside, I could imagine a future in which everyone wears a body camera everywhere. A digital record of your entire life. Given increasingly good AI extraction, you can then have searchable transcripts of your conversations, query your life for the last time you saw movie X (interesting IP questions here), re-experience moments with grandma, whatever. There was a Black Mirror episode incorporating this concept.

4K video times a lifetime is a greater storage requirement than is available to consumers today.
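Rough numbers, assuming a ~25 Mbit/s 4K encode and 16 recorded hours a day (both assumptions) over an 80-year life:

    bitrate_bps = 25e6
    seconds = 16 * 3600 * 365 * 80
    total_bytes = bitrate_bps / 8 * seconds
    print(total_bytes / 1e15)   # ~5 PB, well beyond today's consumer storage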


If you find such a concept interesting, I would recommend the movie The Final Cut (2004) with Robin Williams. Not a particularly good movie, but it does have this as an interesting premise.


Resolution beyond human perception is useful for watermarking, data tracking, EXIF, DRM schemes, and in/exfiltration in general.


> There are only so many holiday pictures you can take, after all.

You miss the point, sadly.

Yeah, storage is getting larger and cheaper, but modern cameras/phones/etc. take photos that are way larger than they used to be.

My first digital camera in like 2001 or so could only take either 20 "large" (640x480) bitmap photos or 80 "small" pictures, taking a few tens of kilobytes each at most.

My 2022 iPhone SE takes high-quality pictures that are easily in the 7-8MB range (I just checked).

So yeah, disks keep getting larger, but so does the media.


And image sizes aren't really bound by human perception in any real way. Yes, screens typically range from 2-8MP and probably won't go much beyond 32MP. But more resolution, more dynamic range, and more color fidelity are incredibly useful for editing pictures; and you can always use extra resolution for digital zoom.

Photos are however constrained by physics. There are only so many photons captured in a certain area in a given time, so no matter whether we talk about the tiny lenses and sensors of smartphones or the much larger versions on dedicated cameras there's a very real limit (and phones are pretty close to the limits imposed by their sensor size)


I spend so much storage on just storing things that are readily available on the internet, either to speed up local access or because one day they might not be as readily available. I don't think we will "solve" storage for good unless we also "solve" bandwidth.


I think we will always fill up the storage available as we crank up the fidelity of our recordings (simulations). Video is an approximation of reality, and we can always ramp up the resolution, ramp up the dimensions, always striving to simulate the universe at the same fidelity as the universe itself, to the point that the map becomes the territory itself. In other words, until it's no longer approximating, no longer compressing. There will never be enough storage to store our simulations of our reality as long as we're forced to use that same reality to construct the storage.


I'm already at that point, effectively. I spent a few decades worrying about fitting data into hard drives... most recently with my digital photos. I've stopped worrying.

My first computer stored data on audio cassette tapes. When I started college, the new hot machine in the server room was a DEC VAX 11/780 running VMS.

For nostalgia purposes, I have a virtual VAX 11/780 running VMS on my phone; it only takes about 3 GB. I don't do much else with the phone, so I have plenty of room left over.

I can imagine ways to fill up a petabyte personally... but not much past that.


Only if we reach a peak fidelity level. Since storage will never be infinite, unless we reach a level of media quality that is good enough for everything, it will always be an arms race.


I think even if we got storage that could store enough information to be indistinguishable from reality itself, we would still want to save variations, clips, duplicates, intermediates... I don't know that there is a peak fidelity.


We are already doing that, though.

The thing is, there are only 24 hours in a day. There is a hard upper limit to the amount of content you can consume. You're not going to be downloading 1000 hours worth of content every single day, just because it is physically impossible to use it.


Tell that to /r/datahoarders. I'm always amazed by what people archive in that subreddit.


My 500 GB HDD has been way more than adequate for the past 10+ years, and that includes having a Windows 8 partition that I haven't booted into in years.


A few weeks ago I saw data on floppy disks from the early 1990s, on an Apple ][. Magnetic media lasted way longer than I expected.

But, in this case... I have a hard time believing in a "Thousand Year Write" ;-)

There's no way that something that thin and ceramic isn't going to somehow become embrittled, or fracture, in 100 or more years.


> This first demonstration unit doesn't compete with the very best data storage units out there, but the company plans to scale up its ceramics-based data storage system

This begs the question: on which performance measures does it fail to compete? Supposedly not on capacity.

Read speed? Write speed? Power use? Cost? Sadly they don't say.


You are misusing the term "begging the question", which is a formal logical fallacy and not synonymous with "raising the question".

https://en.wikipedia.org/wiki/Begging_the_question


I think one important reason people misuse this so much is because "begging the question" does not straightforwardly convey the meaning assigned to it by whichever logicians coined the modern English term. It's a linguistically crappy term. I'd personally prefer that "beg the question" be repurposed in the way GP used it, and that another term be used to describe the logician's idea.


This one’s lost. Carry on using it correctly among those who get it, avoiding the expression among those who probably don’t, and quietly accepting this usage from those who employ it (while, perhaps, marking them off your mental list of potential proof-readers).

I don’t even feel too bad about this one, because it’s so easily misunderstood. May as well let it go.


I don't think it's lost at all. One mindful correction may lead the recipient to a lifetime of correct usage (and perhaps correcting others). The viral coefficient seems favorable.


It is not incorrect to use the expression this way; it can universally be understood to mean what the author meant through context, whereas the usage you consider correct is exotic.

https://www.merriam-webster.com/grammar/beg-the-question


You're getting downvoted, but thanks for posting this link. The blog post by Stan Carey (excerpted below) mentioned in it was quite interesting.

> Beg the question first appeared in English in a 1581 text of Aristotle’s Prior Analytics, and this translation has had semantic ripples down the centuries. The phrase is opaque because its use of beg is really not a good fit – it’s no wonder people have interpreted it ‘wrongly’. Had the original English translation been assume the conclusion or take the conclusion for granted instead of beg the question, there would be far less uncertainty and vexation.

> 269 out of 300 examples of begs the question used it to mean raises the question, more or less. That’s 90%. This figure show its huge predominance in contemporary discourse. Outside of formal debates and philosophical or semi-philosophical contexts, the traditional meaning of beg the question is hardly ever used. The evade the question use is rarer still.

> This is why insisting on the original use, as prescriptivists do, risks confusing many readers. It’s not a practical or constructive stance. Correctness changes with sufficient usage, yet sticklers still refuse to accept there can be more than one way to use this phrase. By adopting the tenets of one phrase → one meaning and original meaning = true meaning, they have painted themselves into a corner.

> The expression is ‘skunked’, to use Garner’s term. Grammarphobia agrees that it’s ‘virtually useless’, and Mark Liberman recommends avoiding it altogether. In formal use I advise caution for this reason, but in everyday use you’ll encounter little or no difficulty or criticism with the raise the question usage.

Additionally, the LangLog link shows that "begging the question" is the result of badly translating a less-than-perfect translation: Greek to Latin and Latin to English.

> Some medieval translator (does anyone know who?) decided to translate Aristotle's "assuming the conclusion" into petitio principii. In classical Latin, petitio meant "an attack, a blow; a requesting, beseeching; a request, petition". But in post-classical Latin petitio was also used to mean "a postulate"

> Why begging the question? Well, petitio (from peto) in this context means "assuming" or "postulating", but it has other (and older) meanings, from which the notion of logical postulate or assumption arose: "requesting, beseeching". So rather than use some fancy Latinate term like postulate or assume, people decided to use the plain English word beg[ging] as a sort of calque for the "requesting" sense of petitio. But even in the 16th century, I think, it was a bit odd to warn people against presupposing the end-point of their argument by telling them not to beg their conclusion.


Language is defined by it's usage.

If enough people make one particular "mistake", eventually that mistake becomes the "correct" usage. See "literally" as an example:

https://www.merriam-webster.com/grammar/misuse-of-literally


or your unnecessary apostrophe in it's


I was using it in this sense

> The phrase "begs the question" is also commonly used in an entirely unrelated way to mean "prompts a question" or "raises a question", although such usage is sometimes disputed.[4]

but will try to refrain from doing so in future...


Everyone misuses the term, which raises questions about prescriptivism vs. descriptivism.


You're begging the question with this dichotomy. It likely originally arose as a descriptive term among those who had the requisite knowledge base prescribed to them.

:P


I interpreted that to mean that the demonstration unit is some kind of prototype that significantly differs from what they are planning to build, i.e. they've shown that it works but it's not in its final form.


2 million laser beamlets sounds very expensive, depending on how they generate them.

It could be that the "writer" costs $1 million right now, so it might only make sense in a data center.


What's a good write-once backup solution for long-term storage _today_, though?

I saw writable Blu-ray discs recommended as they should last for decades; however, 25GB for a standard disc is far from enough for a photo collection, and juggling many discs gets clunky.


See my other comment above about me using the cloud in an unorthodox way.


I wonder what we will find to store in all that space? Stuff that shouldn't be stored, probably! https://news.ycombinator.com/item?id=38498090


Maintaining the beam quality of 2000 laser beamlets ain't cheap.


Bollocks. Sensationalist fluff. Will never happen, with "never" meaning "not during our lives".

I'm willing to bet you on that; it's a lock. Easiest money possible.


Yep, we have been promised crazy storage like this for years now and it has always been vaporware. Remember this? https://www.engadget.com/2007-03-28-mempile-shows-off-teradi...

One can dream though....



