scaling to millions (iusearchlinux.fyi)
[-] hobbsc@lemmy.sdf.org 61 points 1 year ago

There's a really fine line between needing a spreadsheet and needing a database, and I've not yet found it. It's probably fuzzier than I realized, but I have participated in so many programming projects that amounted to a spreadsheet that lived too long.

[-] GenderNeutralBro@lemmy.sdf.org 58 points 1 year ago

Does it need to be accessed by multiple people? Does it need to be updated frequently? Does it need to be accessed programmatically? Does performance matter? If you answered "yes" to any of these questions, you should probably use a database.

If it's something you interact with manually, has less than 100,000 rows, and is mostly static, then sure, use a spreadsheet.

I used to have some scripts to convert and merge between CSV and sqlite3. Even a lightweight db like sqlite3 has value. Google Sheets fills some of the gaps with its QUERY function, but I still find it a bit awkward to use a lot of the time.
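
For anyone curious what such a convert-and-merge script might look like, here is a minimal sketch using only Python's standard library. It is not the original script: the function names `csv_to_sqlite` and `sqlite_to_csv`, the `people` table, and the sample data are all made up for illustration, and it assumes each CSV has a header row with one column that can serve as a key.

```python
import csv
import io
import sqlite3


def csv_to_sqlite(conn, table, csv_text, key):
    """Load CSV text into `table`; rows sharing the same `key` value are replaced."""
    reader = csv.DictReader(io.StringIO(csv_text))
    cols = reader.fieldnames
    col_list = ", ".join(f'"{c}"' for c in cols)
    # Untyped columns keep the sketch short; every value is stored as text.
    conn.execute(
        f'CREATE TABLE IF NOT EXISTS "{table}" ({col_list}, PRIMARY KEY ("{key}"))'
    )
    placeholders = ", ".join("?" for _ in cols)
    conn.executemany(
        f'INSERT OR REPLACE INTO "{table}" ({col_list}) VALUES ({placeholders})',
        ([row[c] for c in cols] for row in reader),
    )
    conn.commit()


def sqlite_to_csv(conn, table):
    """Dump `table` back to CSV text, header first, ordered by the first column."""
    cur = conn.execute(f'SELECT * FROM "{table}" ORDER BY 1')
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([d[0] for d in cur.description])
    writer.writerows(cur)
    return out.getvalue()


# Hypothetical data: two exports that overlap on id 2.
conn = sqlite3.connect(":memory:")
csv_to_sqlite(conn, "people", "id,name\n1,Ada\n2,Bob\n", key="id")
csv_to_sqlite(conn, "people", "id,name\n2,Robert\n3,Cy\n", key="id")
merged = sqlite_to_csv(conn, "people")
print(merged)
```

The merge here is "last file wins" via `INSERT OR REPLACE`; a real script would probably want a smarter conflict rule.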

[-] DoomBot5@lemmy.world 12 points 1 year ago

Google Sheets works just fine for access by multiple people.

The line is probably somewhere on machine vs human readable.

[-] 4am@lemm.ee 7 points 1 year ago* (last edited 1 year ago)

Performance. If you get 30k transactions per second don’t even SAY spreadsheet lol

[-] DoomBot5@lemmy.world 12 points 1 year ago

Per second? If you get that many per day, I wouldn't touch it with a spreadsheet.

[-] Uprise42@artemis.camp 8 points 1 year ago

I can answer yes to all of these questions but still use a spreadsheet. I understand your point, but I feel even with these the line is still gray.

I just checked and my largest spreadsheet currently has 14,300 lines across 12 tabs. Each tab contains the same information, just pulled from a separate form, so each tab is linked to a form and updates automatically when someone submits a new response. We then process these responses periodically throughout the day. Finished responses are color-coded, so there's a ton of formatting. Also, 7+ people interact with it daily.

Then we have a data team that aggregates the information weekly through a script that sends them a .csv with the aggregate data.

The spreadsheet (and subsequent forms) are updated twice a year. It was updated in June and will be updated again in December. It’s at 14k now and will continue to grow. We’ve been doing this through a few iterations and do not run into performance issues.

[-] peter@feddit.uk 13 points 1 year ago

At some point you end up surpassing databases and end up with a giant pile of spreadsheets called a data warehouse

[-] MonkderZweite@feddit.ch 4 points 1 year ago* (last edited 1 year ago)

As soon as you stop maintaining the data by hand, start using a db.

[-] phorq@lemmy.ml 45 points 1 year ago

The article doesn't seem to say what type of database they moved to, I'd like to imagine it's an excel spreadsheet...

[-] ForgotAboutDre@lemmy.world 14 points 1 year ago

The problem with the spreadsheet was rate limiting by Google. I like to imagine they have the spreadsheet copied and pasted, then split the requests across two different spreadsheets, doubling the number of requests they can make.

[-] bdonvr@thelemmy.club 13 points 1 year ago

Dear god, parallelized excel spreadsheet databases!

[-] tourist@lemmy.world 21 points 1 year ago

SAP S/4 HANA is not mental illness

It's worse

Your physical health and everyone you love will suffer too

[-] nomecks@lemmy.world 11 points 1 year ago* (last edited 1 year ago)

Sanduhr Anzeige Programm, or "Hourglass Displaying Program" in English, first started out as a hardware stress tester. It only made sense that it would evolve into a human stress testing program from there.

[-] Black616Angel@feddit.de 2 points 1 year ago

I don't wanna give them bad ideas, but the only logical next step is to have 2TB of CPU cache.

[-] cupcakezealot@lemmy.blahaj.zone 14 points 1 year ago

All the cool kids use Microsoft Access

[-] Awkwardparticle@artemis.camp 13 points 1 year ago

Not going to lie. I made it really far using Google Sheets as a database until I had to move to a full DB. Google Sheets is just an interface to a Google-run DB anyways. A lot of the time I would just load sheets (tables) into data structures from a SQL library and use it like a regular database.
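
The comment doesn't say which SQL library was involved, so the following is only a sketch of the general idea, with Python's built-in `sqlite3` swapped in as an assumption. The `load_sheets` helper and the `rides`/`users` sheets are hypothetical, and the rows are hard-coded rather than fetched from Google Sheets; each sheet is assumed to be a list of rows with the header first.

```python
import sqlite3


def load_sheets(sheets):
    """Turn {sheet_name: [header_row, *data_rows]} into tables in an in-memory DB."""
    conn = sqlite3.connect(":memory:")
    for name, rows in sheets.items():
        header, *data = rows
        cols = ", ".join(f'"{c}"' for c in header)
        marks = ", ".join("?" for _ in header)
        conn.execute(f'CREATE TABLE "{name}" ({cols})')
        conn.executemany(f'INSERT INTO "{name}" VALUES ({marks})', data)
    return conn


# Hypothetical sheet contents, shaped like rows read from a spreadsheet.
sheets = {
    "rides": [["ride_id", "user_id", "km"], [1, 10, 12.5], [2, 11, 3.0], [3, 10, 7.5]],
    "users": [["user_id", "name"], [10, "Ada"], [11, "Bob"]],
}

conn = load_sheets(sheets)
# Once the sheets are tables, cross-sheet lookups become ordinary joins.
totals = conn.execute(
    "SELECT u.name, SUM(r.km) FROM rides r "
    "JOIN users u USING (user_id) GROUP BY u.name ORDER BY u.name"
).fetchall()
print(totals)
```

The data still lives in the spreadsheet; the in-memory tables are just a queryable snapshot, which is roughly where this approach stops scaling.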

[-] randomTingler@lemmy.world 10 points 1 year ago

I use Google Sheets on my personal projects.

  • tracking expenses
  • tracking my ride logs from my ev

A Google Apps Script picks up emails received from banks and updates the transactions automatically. I modify them using a web-based form if needed.

The EV company has an API. It is called periodically to get the ride details and update the sheets. I use the Telegram API to interact with it: triggering the API webhook, getting charts, etc.

Initially I set up OCR to extract the ride information, so the process was like:

  • send an image that has ride details to a Telegram bot
  • the bot saves the image in Google Drive
  • the image would be opened in Docs
  • the text would get extracted and stored in Google Sheets as data
  • edits can be done using the Telegram webapp
  • Sheets provides charts for analysis
  • built a process using Sheets to handle the Telegram bot for maintaining users, approvals, etc.

It would have been hard if I had used any other service: setting up OCR, preparing charts, etc.

[-] MonsiuerPatEBrown@reddthat.com 5 points 1 year ago

thank god postgres is still safe!

[-] xmunk@sh.itjust.works 5 points 1 year ago

Nobody is dumb enough to insult postgres - we'll fucking burn you at a stake for heresy like that.

[-] Cwilliams@beehaw.org 1 points 1 year ago
this post was submitted on 27 Sep 2023
428 points (96.7% liked)

Programmer Humor
