I predict a problem:
Currently Mastodon is being run as free software by hobbyists and enthusiasts. They're piggybacking on infrastructure built for much larger projects and customers. You don't build a server hall for a thousand low-tier payers of $15 / month. You build it for the two of those that will need to scale up to the enterprise level, and eventually (now) fedi software will have to scale to that same tier.
Hi there. I'm fairly new to mastodon but know a bit about web infrastructure.
Making sure projects are financially stable (and stable in terms of capacity) is very important to be concerned about.
Would be interested in hearing your thoughts on a company like 1984? ( https://1984.hosting/about/ )
So far, though, over the years, when it comes to web hosting, the trend has always been towards allowing people to do more for less money.
If this does change though, in the way that you say, the economy as a whole will likely be pretty fucked, I imagine.
@benx Hi! Sorry for just going to bed straight after posting! 🌳
Welcome to Mastodon! I'm going to set this to public just to keep going on my own thread.
What I want to emphasize is that currently small projects can piggyback off of the infrastructure that is required for bigger projects hosted on cloud providers. There are two reasons why things are cheap for small projects at the moment, and there are two strategies that instances can take in response. Both strategies are bad.
@benx Let's do the reasons for the current cheapness first.
There is a fine balance between ludicrous profit margins and destitution for companies in narrow industries. Cloud infrastructure is one of these industries.
What has tipped the balance over the last decades has been the culture of open-sourcing of infrastructure tooling, and Moore's Law. Because of the fact that the servers run open source software, there is little startup cost for software for a competing cloud provider.
@benx What is needed, however, is human capital in the form of educated workers for the management, and hardware capital in the form of CPUs, RAM, storage, and server hall equipment. These aren't cheap, but they're not mainframe, rocket-science level hard to pull off. As a provider you can also start small and ramp up gradually as your customer base grows.
Now for the reasons why it's been smart for these providers to provide low-cost hosting to people looking to run smaller projects.
@benx 1) The half-life of existing capital invested into CPU, Storage and RAM has been short. This is Moore's law in action and it has tipped the balance in favor of new competitors, because they can recruit the educated workers of the previous cycle, thus not paying for training them, while at the same time getting more performant hardware per dollar than their older competitors. The only thing that matters long term is customer loyalty, and recruitment of the next big thing.
@benx 2) Because of the fact that recruitment of new projects is the lifeblood of cloud providers there's been an enormous need to make it easy to provision new environments for new tenants, because the one that provides the easiest and cheapest setup has the highest chance of getting the golden egg. The one that has the most startups on its customer lists has the highest chance of getting to host the next TikTok, and that is where the big bucks lie.
It is a marketing strategy.
@benx Now, #fedi is paradigmatically different than what the providers are used to. For one, instances do not need to grow their own user bases to be viable because of the fact that instances are interconnected. The fediverse breaks the upgrade cycle that the cloud providers want because we have no interest in growing.
Secondly, instances do not have an inherent revenue stream which scales with activity, so if computing were to become scarce we cannot compete against better paying customers.
@benx Computing has historically been cheap, but that is because profits could be extracted from technological breakthroughs in developing better CPUs. And now Moore's law is petering out, which means that in the near future the incentive to produce more and better CPUs will decrease, and capital will try to find investment elsewhere.
But the cloud providers still want a return on their now much more long-lived hardware investments, so they will have to squeeze the market.
@benx Instances have, as I see it, two options here. We either stay small, or try to grow. Both are problematic.
Many Smaller Instances: The problem here is that if cloud providers find that the number of smaller long-term instances on their lowest tiers doesn't cut it for their profit goals, they will change the contracts. This is unsustainable for instances that live on donations.
I also have my doubts that smaller instances can handle the mass of people now fleeing Twitter.
@benx So some big instances are required, but they're going to get the other end of the stick if they keep hosting contracts with arm's-length providers. Those higher tiers cost more than a hobbyist can afford, and I have a feeling that as an instance grows in users, the $-amount / user in donations decreases. Gains from scale exist, but ratcheting of price will still happen.
I am just opposed to the idea of wealth transfers just for the sake of being able to maintain our own public squares.
@benx So, to me, the only solution out of this squeeze, which again is just inequality and private property by another name, is to #ownthemeansofcomputing by which I mean co-op server providers that run at break-even for the benefit of the community.
We need commonly owned infrastructure (just as with roads and bridges) or otherwise capital holders will start putting up toll-booths.
Anyways, thanks for coming to my TED-talk. Interested in what @forestjohnson would think of this 🌿
Hey, so, like I was saying I do have some familiarity with web server hosting already.
This is because I helped run a hosting provider for over a decade, and throughout that time we made an effort to make the hosting more secure and to have more control over the resources.
I wasn't entirely clear about what you meant by rent for infrastructure because there are different levels of infrastructure (1/?)
Web architects is one co-op I know of, set up by people from the political activist scene. I don't know what their current setup is but I know they have maintained their own physical servers in the past. https://www.webarchitects.co.uk/
Security wise the best place to host servers is a country with good data protection laws which is why I mentioned 1984. Web architects has provided hosting on 1984 as a more data secure option. (2/?)
There are many other activist run hosting collectives that are out there, who will have different structures and different levels of infrastructure control. I will try and find a list later and post it here.
I don't know if you are familiar with riseup.net
Riseup.net mentions that they run their own colocation (where web servers are physically run from).
They have a list of other similar projects on their website. I'm not sure how up to date it is tho.
I'm sure there are many other similar projects not listed there, but that list may be a good place to start and those groups will likely know of other orgs/co-ops doing similar.
If you are mainly worried about cost, though, you may find that it is probably cheaper to utilise the large cloud services that are available.
And then, if need be, you can shift to a different server, whether that be for cost or security reasons.
If you don't trust your web host from a data protection point of view but it's the only affordable option for what you need then I would try to set your system up to encrypt as much as you can. That itself is a whole other thing.
@seedlingattempt I don't know... I don't think that a lot of the stuff you are talking about can come to pass.
Yes, these hosting providers are driven by greed... But as far as we know, there's no monopoly, there's no syndicate. They **compete** with each other.
Also, keep in mind that in many ways public clouds are sort of like a utility. Like water or electricity. The stuff they sell is fungible & you can purchase different amounts of it, generally the price per unit stays the same-ish for a given provider. In fact I would say it gets CHEAPER per unit as you buy in bulk, not more expensive.
IMO, this competition in a market for a water-like commodity means that we'll always be able to buy some if we want. The price isn't going to skyrocket. I don't see either supply drying up or demand exploding any time soon.
I worked in the enterprise software world for 5 years, for the last 1.5 years of that I worked as a DevOps specialist / SRE for a company that spent almost a million dollars a year on AWS EC2 instances and similar...
I'm extremely familiar with scaling software, the type of problems that come up at scale, and how that translates to the economics of the situation. In my opinion you are missing the most important aspect of the scale question:
**COMPUTERS ARE LIKE, EXTREMELY EXTREMELY FAST**
A computer can easily do a million things per second without breaking a sweat. Yes, even over a network, yes, even with on-disk persistence and each event being validated.
Computer science teaches us to ignore the "constant factors" (each event taking 5 microseconds to process versus each event taking 500 microseconds to process) and instead place a laser-like focus on the __Growth Rates__ of the CPU time and memory requirements as the scale of the problem grows.
In my experience at work, both things end up mattering, but if you don't get the growth rate stuff under control first, any optimizations that can be made won't change the overall picture much.
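To put rough numbers on that, here's a minimal Python sketch. It assumes the 5 and 500 microsecond per-event costs from the example above and two made-up designs: one that touches each event once, and one that compares every event with every other event. The function names are hypothetical and the arithmetic is only illustrative, not a benchmark of any real server.

```python
# Hypothetical per-event costs, taken from the 5 us vs 500 us example above.
FAST_US = 5
SLOW_US = 500


def linear_seconds(n, per_event_us):
    """Total time if each of n events is handled exactly once: O(n)."""
    return n * per_event_us / 1_000_000


def quadratic_seconds(n, per_event_us):
    """Total time if every event is checked against every other event: O(n^2)."""
    return n * n * per_event_us / 1_000_000


# Compare the "slow code, good design" case with the "fast code, bad design" case.
for n in (100, 1_000, 100_000, 1_000_000):
    slow_but_linear = linear_seconds(n, SLOW_US)
    fast_but_quadratic = quadratic_seconds(n, FAST_US)
    print(
        f"n={n:>9,}  "
        f"O(n) at {SLOW_US}us/event: {slow_but_linear:>12.2f} s   "
        f"O(n^2) at {FAST_US}us/event: {fast_but_quadratic:>14.2f} s"
    )
```

Under these assumptions the two designs break even at 100 events. Past that, the 100x faster per-event code loses to the slower code with the better growth rate, and the gap keeps widening as n grows.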
The problem here is that all too often, the "growth rate" of the CPU time, etc, the "Big O Notation" of your program or network, doesn't depend on what language it's written in, it doesn't depend on what hardware it runs on or how fancy the network is. It purely depends on the DESIGN. The interface design. API design. How the parts fit together and move together.
All too often, software is designed quite well for one thing but ends up being used completely differently -- or it's just designed poorly. There is not always an upgrade path from poor design, especially with a networked community of servers like mastodon / ActivityPub. I don't know enough about ActivityPub myself to comment on how its design affects its ability to scale, but I do feel confident to say:
The API design of ANY software will affect its ability to scale 100x more than any economic issues like datacenter costs.
Those Enterprise cloud customers that pay $1M/year to AWS aren't paying that much "just because"... they are paying that much because it's cheaper than trying to re-architect their system with a better design. They probably pay over $10M/year in salaries and benefits... It's simply a lot cheaper to hire a few hundred virtual machines to run inefficient code than it is to hire a team of professionals to figure out an upgrade path away from said inefficient code.
New ActivityPub servers like GoToSocial promise to explore the limits of ActivityPub optimization -- if ActivityPub's design allows for it, I predict that once properly optimized, a GoToSocial instance will be able to handle hundreds of users and thousands of federated connections WITHOUT needing a hardware upgrade. I predict bandwidth will actually be more expensive than the computation side of things!!
That's just my 2c.
Honestly I would worry more about legal issues and regulations making it harder for the little guy to get access to public clouds. That's the only thing I can think of that would force people into livingroom servers.
But IMO it makes no sense for the government to do something like that, if they already have their 3rd party doctrine and everyone and their brother is happily occupying the public cloud panopticons, why upset the apple cart? Gov't would lose an incredibly powerful surveillance weapon by kicking the grassroots out of the public cloud.
Jokes aside, I get your point about compute being a utility much like electricity itself or even water. But your comment about bandwidth possibly being scarce is a little alarming.
So my thinking is probably off by a couple of orders of magnitude and I need to reassess my assumptions. This ain't a problem unless artificial scarcity happens.
Yeah I don't actually know that much about the market for bandwidth at the commercial scale, I just know that it's the most common thing that you get nickel-and-dimed for on the public cloud. It could be a little bit of a "gentlemen's agreement" among the public clouds that they all get to overcharge for it.
But I think some of them don't, like hetzner for example offers 20x cheaper bandwidth compared to DigitalOcean.
In terms of real numbers, @f0x the owner of the mastodon server I use plus an active matrix server and some other stuff, saw something like 80TB of bandwidth in a month if I remember correctly. That's 4x the amount you get included with hetzner's $5/mo VPS
My memory must be messed up -- I obviously don't remember what it was, I just remember that it was an impressive amount from my PoV. My server was serving up less than 100GB/mo. But I don't serve any social media stuff.
I was planning on charging $0.01 per GB for greenhouse (DigitalOcean prices) but it would have made your situation cost-prohibitive. I also didn't know about the cheap bandwidth on hetzner.
I would love to be able to make greenhouse into a bargain basement "efficient market" for bandwidth if possible, still working on figuring out what that would look like. But I can definitely do better than $0.01/GB. With hetzner prices it's about $0.50/TB, 20x cheaper.
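As a sanity check on that 20x figure, here's a small Python sketch that prices the same monthly traffic at both rates. The two prices are the rough ones quoted in this thread, not current price lists. The decimal TB conversion, the sample traffic volumes and the function names are all assumptions for illustration.

```python
# Rough prices quoted in the thread; real price lists may differ.
PRICE_PER_GB_METERED = 0.01  # USD per GB (the DigitalOcean-style rate)
PRICE_PER_TB_CHEAP = 0.50    # USD per TB (the approximate hetzner-style rate)
GB_PER_TB = 1000             # assumption: decimal terabytes


def monthly_cost_metered(traffic_gb):
    """Monthly bandwidth bill in USD at the per-GB rate."""
    return traffic_gb * PRICE_PER_GB_METERED


def monthly_cost_cheap(traffic_gb):
    """Monthly bandwidth bill in USD at the per-TB rate."""
    return traffic_gb / GB_PER_TB * PRICE_PER_TB_CHEAP


# Sample volumes: ~100 GB/mo (a small personal server) up to a hypothetical 20 TB/mo.
for traffic_gb in (100, 1_000, 20_000):
    metered = monthly_cost_metered(traffic_gb)
    cheap = monthly_cost_cheap(traffic_gb)
    print(
        f"{traffic_gb:>6,} GB/mo: ${metered:>8.2f} metered vs ${cheap:>6.2f} cheap "
        f"({metered / cheap:.0f}x)"
    )
```

The ratio comes out the same at every volume because both rates are flat per unit. That only holds if neither provider has free allowances or tiered pricing, which this sketch deliberately ignores.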
Just echoing that the cost of the physical server itself is definitely the lower concern compared to bandwidth costs and (perhaps also) rent for the space that the server is stored in.
I tried to set up a video hosting site in the past and ran into the issue of how much bandwidth it used.
I am interested in the potential of Peertube though (due to it piggybacking on the WebTorrent torrent protocol in order to ease the bandwidth strain, in theory)
Beyond that even, the physical network itself is owned and run by phone companies.
On a small scale, you can set up your own LAN between computers to allow data transfer between those computers without any concern over bandwidth costs.
If a big enough movement existed you could (in theory) build a mesh network (via wifi routers) and bypass the phone companies completely.
When it comes to the big apps you mention (like TikTok) one thing you have to remember is that most of these sites tend to run at a loss for many years before making profits. They only survive because they have the backing of Silicon Valley investors (or some other funder) to keep them running, based on their potential future profits.
The fediverse being decentralised does give it the potential to compete though (similar to how Bernie Sanders got funds through lots of small donations rather than big donations).