
How much data does an instance want to hold, and how much for specific communities?

There are also issues lurking with the accumulation of data. A move to a batch processing system should take into account that some instance operators may only wish to retain 60 days of fresh content, rather than keeping the complete history of every community for search engines and local search. The performance difference is huge, which is why popular Lemmy servers have been crashing constantly: the amount of data in the tables entirely changes the performance characteristics.
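To make the retention idea concrete, here is a minimal sketch in Python of a per-community retention rule. Everything in it is hypothetical: `RetentionPolicy`, `default_days`, `per_community_days` and `is_expired` are names invented for illustration, and Lemmy has no such setting today. It only shows how "60 days by default, full history for selected communities" could be expressed.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional


@dataclass
class RetentionPolicy:
    """Hypothetical operator settings; not an existing Lemmy feature."""
    # Days of content to keep by default. None means keep forever.
    default_days: Optional[int] = 60
    # Per-community overrides, e.g. {"archive": None} to keep full history.
    per_community_days: Dict[str, Optional[int]] = field(default_factory=dict)

    def days_for(self, community: str) -> Optional[int]:
        # Fall back to the instance-wide default when no override exists.
        return self.per_community_days.get(community, self.default_days)


def is_expired(policy: RetentionPolicy, community: str,
               published: datetime, now: datetime) -> bool:
    """Return True if a post is older than the retention window."""
    days = policy.days_for(community)
    if days is None:
        return False  # this community keeps its complete history
    return now - published > timedelta(days=days)


if __name__ == "__main__":
    policy = RetentionPolicy(per_community_days={"archive": None})
    now = datetime.now(timezone.utc)
    old = now - timedelta(days=90)
    print(is_expired(policy, "news", old, now))     # default 60-day window
    print(is_expired(policy, "archive", old, now))  # kept forever
```

A batch job could apply a check like this when deciding which rows to purge or move to slower storage, so the hot tables stay small on instances that only want fresh content.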

Right now, Lemmy has no concept of tiered storage, of content that is deliberately absent from replication, or of purge choices. Looking from the bottom up, at the point where an API client's request arrives before it touches PostgreSQL, a smart caching layer could even proxy to the API of a peer instance and offer a virtual (cached) copy of the data for a listing or post. Such a design could intelligently choose to do this for a small number of requests and avoid burdening PostgreSQL with storing a post from months or years ago that only a few people have taken a recent interest in (or that a search engine wants to pull a copy of).
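As a rough sketch of that caching layer, the Python below keeps a small in-memory read-through cache in front of a peer instance. It is an illustration under assumptions, not how Lemmy works: `PeerReadThroughCache` and its parameters are made-up names, and `fetch_from_peer` is a hypothetical callable standing in for whatever HTTP call would hit the peer's API. Entries expire after a TTL and the least recently used entry is evicted when the cache is full, so nothing is ever written to PostgreSQL.

```python
import time
from collections import OrderedDict
from typing import Any, Callable, Tuple


class PeerReadThroughCache:
    """Serve old posts from a peer instance without storing them locally."""

    def __init__(self,
                 fetch_from_peer: Callable[[str, int], Any],
                 max_entries: int = 1000,
                 ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        # fetch_from_peer(peer_host, post_id) is a hypothetical hook; a real
        # one would call the peer's HTTP API and return the decoded post.
        self._fetch = fetch_from_peer
        self._max = max_entries
        self._ttl = ttl_seconds
        self._clock = clock
        # (peer, post_id) -> (expires_at, value), oldest use first.
        self._entries: "OrderedDict[Tuple[str, int], Tuple[float, Any]]" = OrderedDict()

    def get(self, peer: str, post_id: int) -> Any:
        key = (peer, post_id)
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and hit[0] > now:
            self._entries.move_to_end(key)  # mark as recently used
            return hit[1]
        # Miss or stale entry: ask the peer, then remember the answer.
        value = self._fetch(peer, post_id)
        self._entries[key] = (now + self._ttl, value)
        self._entries.move_to_end(key)
        while len(self._entries) > self._max:
            self._entries.popitem(last=False)  # evict least recently used
        return value


if __name__ == "__main__":
    def fake_fetch(peer: str, post_id: int) -> dict:
        # Stand-in for a network request to the peer instance.
        return {"peer": peer, "id": post_id}

    cache = PeerReadThroughCache(fake_fetch, max_entries=2, ttl_seconds=60)
    print(cache.get("peer.example", 1))
    print(cache.get("peer.example", 1))  # second call is served from memory
```

The size bound and TTL are the point of the design: a handful of people or a crawler revisiting a years-old post costs one request to the peer and a little memory, instead of a permanent row in the local tables.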
