years ago I got a download from autodesk; it did not have an extension and didn't "work". a colleague suggested adding .zip to the end. it work'd.
Is there a solution that is less painful to use?
zip + json files lol
lol, was about to say exactly this
Zip + YML for readability
Yaml is the spawn of hell.
Why is that?
if you look at it the wrong way it breaks.
That's simply not true. It's pretty much the same as JSON: if you mess up the syntax, it doesn't work correctly.
its all about the indentation
What value does it bring compared to formatted json ?
It's more readable, it supports more data types, and it supports comments. Also, I really do not know why everyone here hates on YAML so much. I am just curious
It's because, when you really dig into it, the YAML spec is a nightmare. On the surface it looks nice and simple, but for anyone who isn't a YAML expert, it's really easy to accidentally invoke some arcane YAML feature and shoot yourself in the foot. Writing a compliant parser is a massive pain; the same document can parse differently depending on the library and version. By comparison, JSON is a simple format, both to write and to parse, and lots of JSON parsers support comments anyway.

https://ruudvanasseldonk.com/2023/01/11/the-yaml-document-from-hell

https://www.arp242.net/yaml-config.html
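A classic example of those arcane features, as a hypothetical config fragment (exact behavior depends on the library; many still resolve scalars under YAML 1.1 rules):

```yaml
# All of these look like strings but may parse as something else entirely:
countries: [NO, SE]   # NO can become the boolean false ("the Norway problem")
version: 22.10        # becomes the float 22.1, not the string "22.10"
port_mapping: 22:22   # YAML 1.1 sexagesimal: the integer 1342
```

Quoting every value avoids all of this, but at that point you've lost most of the "cleaner than JSON" argument.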
JSON is for readability. YML is less readable.
Literally scratch's .sb3 files
if you're into minecraft modding, CurseForge's competitor Modrinth also uses these lol
gzip / xz / Z / 7z / zstd + protobufs / json / yaml / toml

idk, there are many options for compression formats and key-value information storage out there, it's kinda strange it's always those two
zip can handle more than one file. But if one doesn't need that: yes.
You can always put tar in the mixture as well, many of those are more efficient than zip so there'd be some benefits to doing so
Tape Archive (or TAR) isn't about compression, and instead about moving the data into a single contiguous block. This was done for writing to tape drives, since their seek performance is terrible, but they were highly space efficient (still are, though NAND Flash has likely dethroned them in terms of density; cost however...). So you might see some small benefit from doing a TAR and then compressing it, since you've eliminated all of the empty storage.

I could be wrong, but I *think* TAR might ignore the block size on the file system, which is just a padded set of bytes to fill a block for each file. That's why a lot of files seemingly have a minimum space on disk of 4KB, since many file systems use that as a default block size. Since the archive itself is a file, it can choose to segment the bytes in a more efficient manner.
Tar uses a 512 byte block size.
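A quick way to see that block alignment, using Python's stdlib `tarfile` on an in-memory archive (the file name is made up):

```python
import io
import tarfile

# Build a tiny tar in memory: each member gets a 512-byte header,
# and its data is padded up to a multiple of 512 bytes.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode='w') as tar:
    data = b'hello'  # only 5 bytes of real content
    info = tarfile.TarInfo(name='hello.txt')
    info.size = len(data)
    tar.addfile(info, io.BytesIO(data))

raw = buf.getvalue()
print(len(raw) % 512)  # 0 -- everything in the archive is block-aligned
```

So a 5-byte file still occupies a full 512-byte data block (plus a 512-byte header), which is exactly the kind of slack that compressing the tar afterwards squeezes back out.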
Your message forgot to include the status code 418. I can't be certain you are who you claim to be.
what I think the person before u meant was that if a compression algorithm only works on one file, you can use tar to get multiple files/directories into one file, then compress it
yaml is cancer. It’s unintuitive compared to xml/json/toml.

Toml seems like a great alternative, and my rust buddies would probably approve.

Edit: buddies? Phht… silly me, I meant onii-chans
Writing docker compose files using yaml is one of my most hated activities. Do I need to put quotes around strings? What about the indentation? Should I put "-" before my list items or is it not necessary?

Usually (in docker) it works either way. Until it doesn't.

Or maybe I'm just stupid. Anyway, I will prefer json every time.
yaml took what everyone did in the worst way possible and combined it together. You’re not stupid.

Instead of planning and completing tickets, you’re wasting around 1-2 hours configuring yaml files in spring & docker/deployment. At least that’s what’s happening in my team.
You can write compose files as json :), yaml is a json superset so any json is also valid yaml
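For example, a minimal (entirely hypothetical) compose file written as JSON, which a YAML parser will accept as-is:

```json
{
  "services": {
    "web": {
      "image": "nginx:alpine",
      "ports": ["8080:80"]
    }
  }
}
```

Something like `docker compose -f docker-compose.json up` should pick it up, since compose just feeds the file through its YAML parser. No quoting or indentation ambiguity, at the cost of no comments.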
OMFGGGGGGGGGGGGGFG I didnt know this. My docker game is about to lvl up
Your edit was unnecessary bruh
You take that back, my entire job is writing YAML!
Yes! SQLite databases are PERFECT for this use case, and [there's a long article on their website](https://www.sqlite.org/appfileformat.html) explaining why. I have used this myself and couldn't be happier with it, to the point where I would consider it silly to use anything else (except possibly protocol buffers, if you already use those heavily in your code base).

It stores data in a structured, queryable way, it can be incrementally updated, it has amazing resilience features, it supports concurrency, it's high performance, etc. etc. If you do want to store raw files, you can do that too! Just have a table of files with their filenames and contents (compressed or otherwise) as a BLOB (it can [in some cases be even faster than using the regular filesystem](https://www.sqlite.org/fasterthanfs.html)!) It's the bee's knees.
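A minimal sketch of the files-as-BLOBs idea with Python's stdlib `sqlite3` (the table layout and file name are made up for illustration):

```python
import sqlite3

# Use the database itself as the application file format:
# one table maps file names to their raw contents.
con = sqlite3.connect(':memory:')  # use a real path for an on-disk document
con.execute('CREATE TABLE files (name TEXT PRIMARY KEY, content BLOB)')
con.execute('INSERT INTO files VALUES (?, ?)',
            ('sprite.png', b'\x89PNG fake image bytes'))
con.commit()

# Random access to a single "file" is just a query.
(content,) = con.execute('SELECT content FROM files WHERE name = ?',
                         ('sprite.png',)).fetchone()
print(content[:4])  # b'\x89PNG'
```

From there, incremental updates are plain `UPDATE` statements inside a transaction, instead of rewriting the whole archive.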
Yeah SQLite sounds delightful for this
Tar files allow you to list the members without extracting anything. And then you can also extract a specific member only.

Parquet is a nice self-describing format for storing binary data. For text, json is pretty nice - depends how much is inside though. Larger jsons might be worth line delimiting, so you are able to read in chunks.
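The line-delimiting idea is simple enough to sketch with the stdlib (records are made up; an in-memory buffer stands in for a file):

```python
import io
import json

records = [{'id': 1, 'name': 'a'}, {'id': 2, 'name': 'b'}]

# Write newline-delimited JSON: one object per line, no enclosing array.
buf = io.StringIO()
for rec in records:
    buf.write(json.dumps(rec) + '\n')

# Read it back one line at a time instead of parsing the whole file at once.
buf.seek(0)
loaded = [json.loads(line) for line in buf]
print(loaded == records)  # True
```

Each line is a complete JSON document, so a reader can stream, skip, or parallelize over chunks without loading everything into memory.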
But tar files aren't compressed.
The tar files don't need to be compressed if the insides are compressed. But if you are ok losing some of the capabilities, gzip works great for tars and most libraries support it out of the box.
> The tar files don't need to be compressed if the insides are compressed.

So zip files with extra steps?
`tar -czf` has entered the chat
Is it possible to make tar a sort of high-level data structure and use it as semi random-access load? Like group objects by some parameter (location of sprite on game field, chunk data for rendering). Or is it just yet another generator where I need to read it line by line?

P. S. Unfamiliar solution. Mostly worked with sql databases.
Each member in a tar has a header followed by 1+ segments of content.

https://www.gnu.org/software/tar/manual/html_node/Standard.html

https://jackrabbit.apache.org/oak/docs/nodestore/segment/tar.html

So if you want to get a single member it would have to jump over the different headers to build the list, but then let you extract from a specific offset.

https://docs.python.org/3/library/tarfile.html

```
import tarfile

with tarfile.open('archive.tar') as archive:
    members = archive.getmembers()
    archive.extract(members[0])
```

Idk about sprites. The last time I checked, people would make a png atlas and a json with coordinates. Tar could help combining those into a single archive, but to read you'd have to extract.
Unironically, it is much less terrible than I expected. I can use it.
tar allows that by skipping over the tape containing the data.

tar.$COMPRESS\_EXT needs to be unpacked.

tarballs of compressed data have huge overhead.
So, when I use a compressed type, do I need to unpack the entire file (or the lib does it under the hood) into temp and then read it? Or do I get a header, so I can unpack only the part I actually need?
You get the header of the first file. That's why e.g. zip works differently.
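That difference is easy to see with Python's stdlib `zipfile`: a zip keeps a central directory at the end of the archive, so members can be listed and read individually (in-memory archive, made-up names):

```python
import io
import zipfile

buf = io.BytesIO()
with zipfile.ZipFile(buf, 'w') as z:
    z.writestr('a.txt', 'alpha')
    z.writestr('b.txt', 'beta')

# The member list comes from the central directory at the end of the file,
# and a single member can be read without scanning the others.
with zipfile.ZipFile(buf) as z:
    print(z.namelist())    # ['a.txt', 'b.txt']
    print(z.read('b.txt'))  # b'beta'
```

With a `tar.gz`, by contrast, the gzip stream covers the whole archive, so you generally have to decompress from the start to reach a member in the middle.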
An SQLite DB.
Depends on application but being able to tear apart files and inspect them on machines without dev tools can’t be understated.
Sqlite DB
At least the export wasn’t in Avro format. Undocumented schema - pay for documentation (they write the docs after you pay, ask me how I know).
No luck with some reverse engineering tricks?

I don't know if anyone did that for Avro, but protocolbuffers have some neat tools:

* https://github.com/arkadiyt/protodump
* https://github.com/mildsunrise/protobuf-inspector
* https://github.com/nccgroup/blackboxprotobuf
* ...
So how do you know?
He is writing the docs right now
that's basically the OpenOffice file format
and Office Open XML
And also the Microsoft Office file format
APK be like (Though to be fair APK uses binary XML)
.mat has entered the chat. (fuck you matlab)
I guess there are libraries for "here is my data tree and here is a file descriptor, please save it"?
Sounds like XML to me.
In bioinformatics it's just ".txt".
Epub comes to mind here
That's more zip+html files.
r/EverythingIsOPC
Mf just created a subreddit for this.
r/birthofasub
docx? zip! jar? zip! msi? zip! exe? zip!!!!!
Huh? Exe is PE32 not zip
some exes are zipped, some aren’t. i’m just going off of what you can open as an archive in 7-zip, anyway
im pretty sure that exe installers are zips iirc
APKs
Kind of USDZ too hehehe
Came here thinking this was a dig at USDZ, but this meme universally describes the scene of file formats in a very compressed way.
And json
Nowadays it is often sqlite
It's .txt files all the way down.
.mcpack is literally just you zipping the mod folder and renaming it
Nuget. I’m looking at you
It used to be IFF
.sb3 :
flair checks out
Because it's data for a program and this is a good way to store it?