My personal VPS was recently inundated with 800GB/month of traffic from AI scraper bots. After a bit of investigation, I found they were getting stuck in some deep git history pages. I looked into Anubis and the like, but making carveouts for API endpoints seemed complicated.
Luckily the Gitea devs had recently implemented `REQUIRE_SIGNIN_VIEW = expensive` as a fix. It was minimally invasive for regular users, since most pages can still be accessed without login, and it completely solved the AI bot problem. My traffic and load averages are back to normal.
Thank you Gitea devs for a great product, happy user for over a decade both personally and professionally.
I didn't know about this option, thanks. I had the same issue and solved it the hard way: I blackholed IP addresses from a bunch of ASNs (openai, microsoft, mistral).
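For anyone wanting to picture the ASN approach, here is a minimal sketch of the matching step only. The prefixes are reserved documentation ranges used as hypothetical stand-ins, not the real networks of any of those companies, and `is_blackholed` is a made-up helper name. In practice the drop would happen at the firewall or routing layer rather than in Python.

```python
import ipaddress

# Hypothetical prefixes standing in for ranges announced by the ASNs you want
# to drop. These are reserved documentation ranges, not real scraper networks.
BLOCKED_PREFIXES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]


def is_blackholed(address: str) -> bool:
    """Return True if the address falls inside any blocked prefix."""
    ip = ipaddress.ip_address(address)
    # Membership tests between IPv4 and IPv6 objects simply evaluate to False.
    return any(ip in net for net in BLOCKED_PREFIXES)
```

The real prefix list would have to come from whatever source you trust for ASN-to-prefix data, which is left out here on purpose.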
I love Forgejo. I recently started a project to exit my business (and eventually personal) git from GitHub. Gitea was my target, having ruled out GitLab based on prior experience administering an instance, but I ended up going with Forgejo and I am glad I did. The Gitea shenanigans around the for-profit entity and its opaque ownership structure were mainly what left a bad taste in my mouth, but there were a few other more minor factors that were use case specific. Fedora recently decided to switch to Forgejo, which is quite a feather in their cap.
I also was somewhat skeptical that a git hosting platform that had a business behind it with enterprise oriented offerings wasn’t yet self-hosting in the technical sense.
Same here. Forgejo is amazing and their development velocity is soaring. And https://codeberg.org is a great host for FOSS projects, in a way I wished Sourcehut would've been except that it leaned hard into some (to me) strange workflow choices.
> The Gitea project is still community-driven and has the same yearly elections for leadership that has been around for close to a decade now :)
[1] mentions changes to the election process that mandate half of the oversight committee be appointed by the Gitea company. Doesn't that conflict with your assertion that the "same yearly elections" have been around?
Where can one find the governance charter for the Gitea project?
> Gitea Enterprise is an offering of CommitGo, not the Technical Oversight Committee of Gitea or the Gitea project itself. CommitGo remains committed to contributing back functionality to Gitea under the MIT license.
Yup, this is the case. I'm the main author on that PR. It sadly stalled due to reviews from other maintainers requiring it to be rewritten using another library, but hopefully I'll be able to get back to it, or someone else will be able to pick it up. We've been able to get other functionality into Gitea already, and I've personally funded maintainers and others' work for the project, which goes directly into the project itself.
I'm the main author of the PR to implement SAML in Gitea, and it sadly has stalled due to reviews from maintainers requiring it to be rewritten entirely using another library. Our governance charter requires a certain process for PRs going into Gitea, and cannot be side-stepped by anyone. As for some of the others, we've been able to merge them in already.
SAML was just an example - I didn't see the PR before I made that post. That said, it feels fundamentally incompatible to a business strategy where your community edition is able to offer all of the features of the premium offering. I just can't see how that business would be able to survive if they allow that to happen.
I'm always dubious of freemium software, because the free version is always gimped in some way, be it SSO compatibility (OK, yours supports OIDC it seems so that's not _terrible_), role-based access controls, high availability, etc.
I will concede that businesses probably _should_ be paying for good software that is critical to their business to help support the vendors, but given how important cost savings are to companies these days, one can hardly blame engineers looking for cheaper offerings.
The difference between the Gitea project and the Gitea Enterprise software offering is that with Gitea Enterprise we are able to include code that was rejected by maintainers (e.g. mandatory 2FA) where there was still a desire for it. Luckily, that feature has since been rewritten in a way that was acceptable for the project, and it has now been accepted and merged. The company has also written code under contract for other companies; they own the IP, so it cannot be accepted by the project because it would not be DCO compliant. Those companies are receptive to open source, and we are working with their legal teams to have them release their claim to the code so we can submit it to the project (large corpos are not known for their speed and understandably want to do their due diligence to ensure that all i's are dotted and t's crossed). There are ~50 community maintainers who have exactly equal say over PR reviews, etc., and that process has always been strictly adhered to.
Edit: Gitea has LDAP, OAuth2/OIDC, OpenID, SMTP, reverse proxy, and others as SSO options.
I agree with your last point, but as someone who co-owns a technology business that doesn’t have an “Enterprise” sized bank account, I still have all of those needs.
The SSO tax in particular is ridiculous.
Functionality like HA or SSO being gated behind enterprise licenses only makes it harder for smaller businesses to “get there”. My business is comprised exclusively of technology professionals. We tend to be really cheap customers to have because we typically only raise a ticket when something beyond our responsibility breaks.
And from the community side — I already have enough credentials to maintain in my personal life. It’s annoying when you can’t use SSO with a community edition product. I like having SSO at home. It makes life so much better, and it also makes me more likely to use a product in my business, which makes it more likely I’ll buy a license to backstop support.
Gitea supports SSO in many different ways, such as LDAP, OAuth2/OIDC, OpenID, SMTP, etc., and it would have SAML too (I'm the main author on the SAML PR to the Gitea project), but it's been held up by community reviews requiring essentially an entire rewrite with another library. We'd love some help to get it across the finish line :) In open-source, money isn't the only thing that can be spent; we can also use our time.
> what happens when a community member wants to implement SAML for the community edition
It's surely just the business model, but I was intrigued and thought that maybe there was some kind of incompatible licensing in popular libraries people use for these so-called "premium features"
Gitea would have SAML too (I'm the main author on the SAML PR to the Gitea project), but it's been held up by community reviews requiring essentially an entire rewrite with another library. We'd love some help to get it across the finish line :) In open-source, money isn't the only thing that can be spent; we can also use our time.
How do you feel about other companies potentially also hosting gitea for third parties?
Also, I’m curious about xorm and how you guys are using your internal database. Is it atypical to perform database operations outside of gitea or integrate with eg a third party users table?
Yes there are :) You can use the package limit setting to change it (search the config docs for `LIMIT_SIZE_CONTAINER`). By default there is no limit, so if you are running into a 413 because container uploads are so large, it could be your reverse proxy configuration.
I liked it, it was pretty cool and seemed to be pretty comparable to Github, but I ended up just moving back to Github since I didn't really want to run my own infrastructure for a git repo.
Still, I would definitely consider it if I were running a company; if nothing else it wouldn't be scanned by Microsoft for training.
I accidentally allowed unrestricted signups on my publicly accessible gitea instance and came back 6 months later to 20,000 accounts hosting spam and malware. Oops. Cleanup required some mysql queries and the cli. Of course it's important to pay careful attention to the configuration of any app; I'm just sharing the story of how I stubbed my toe on this furniture. :)
My instance is mostly used for archiving / mirroring interesting repos, more so since I had a glancing brush with censorship on github: a contributor to one of my repos was banned, which means entire issues and discussions and PRs they started were vanished overnight. This person was prolific and opened a lot of issues, so my repo became a graveyard of broken references and missing threads with conclusions and plans I no longer remember. Despite the minor scale of my project, this incident was rage inducing; it felt like github rebased my master branch to remove historical commits because someone was offended. Completely inappropriate imo.
For self-hosting an archival-oriented mirror, a few features would be nice:
1. Automatically mirror every repo I star on github
2. Continuously mirror issues, discussions, and PRs
3. "safe" mirroring (see #14076), so non-ff/force-push head updates have the old head tagged to preserve history
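To make item 3 concrete, here is a rough sketch of one way such a check could work. It is not necessarily what the referenced issue proposes. The helper names and the `mirror-backup/` tag prefix are made up for illustration; the only git commands used are `git merge-base --is-ancestor` and `git tag`.

```python
import subprocess


def is_fast_forward(repo: str, old: str, new: str) -> bool:
    """True if `old` is an ancestor of `new`, so the update loses no history."""
    result = subprocess.run(
        ["git", "-C", repo, "merge-base", "--is-ancestor", old, new],
        check=False,
    )
    return result.returncode == 0


def backup_tag_name(branch: str, old: str) -> str:
    """Hypothetical naming scheme for tags that keep a replaced head reachable."""
    return f"mirror-backup/{branch}/{old[:12]}"


def preserve_old_head(repo, branch, old, new, ancestor_check=is_fast_forward):
    """Tag the old head before a non-fast-forward update; return the tag or None."""
    if ancestor_check(repo, old, new):
        return None  # plain fast-forward, nothing would be lost
    tag = backup_tag_name(branch, old)
    subprocess.run(["git", "-C", repo, "tag", tag, old], check=True)
    return tag
```

A mirror job would call `preserve_old_head` for each branch whose head changed, before accepting the new head, so force-pushed history stays reachable through the tag.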
Love Gitea. Took less than an hour to get a dockerized instance of it running on my Debian VPS to handle syncing my Obsidian notes between smartphone, laptop, etc.
Recently started using Gitea and have two main questions:
What is the scoop on the schism leading to forgejo? Like, the actual reason - is it just the existence of a for profit company with partial governance over gitea or is there more of a story? And does forgejo have substantially different plans for feature development vs gitea?
Secondly, how do I get in contact with contributors for sponsored work? Ideally that would be the maintainers but I feel like they have a conflict of interest with anybody trying to offer gitea to third parties…
> What is the scoop on the schism leading to forgejo? Like, the actual reason
My 2c as an outside observer - It is all about sponsored work.
Gitea long wanted a CI feature but from the outside, all anyone could see, was a Drone/Woodpecker integration. Codeberg started to spend a lot of time investing in this.
Then one day, Lunny(? i think) appeared suddenly with a fully compliant and working Github Actions CI implementation. The development had been done under a private sponsored contract.
It's great that Actions ended up being open sourced. It's significantly better. But Codeberg really took it the wrong way and started agitating and sponsoring a fork. Nobody wants to be left in the dark.
There is a huge amount of interest in Gitea (and its forks). Everyone wants this to remain MIT, and it obviously will since there's no CLA. IMO all the "gitea company" stuff is about having a better legal structure for contract work on big features like that. That contracting is happening anyway so it may as well have a good legal structure.
Forgejo PR managed to twist that good announcement into seeming like a conflict of interest, because the "Gitea" name was reused for two different concepts. Now that it's CommitGo as the (legally independent) contract development agency, it's much clearer. There is a Gitea company as well but it just needs to hold trademarks and domain names and cloud stuff.
It's really a story of some great developers maturing into a more sophisticated legal and contractual level. The model is quite good, similar to e.g. Debian being the community project and Freexian being one of many commercial contractors for it.
Anyways, compared to Forgejo today, Gitea has the most development activity, all of the core developers, and Forgejo have given up tracking Gitea's main branch and are now adrift. Best of luck to them.
> Secondly, how do I get in contact with contributors for sponsored work?
CommitGo is the legal vehicle for contracting the core developers. For other contributors, bounties are managed via https://algora.io/go-gitea/home
> Forgejo PR managed to twist that good announcement into seeming like a conflict of interest, because the "Gitea" name was reused for two different concepts. Now that it's CommitGo as the (legally independent) contract development agency, it's much clearer. There is a Gitea company as well but it just needs to hold trademarks and domain names and cloud stuff.
Isn't the Gitea company for-profit? Wasn't the leadership committee restructured to mandate half of its members are elected by the for-profit?
Browsing the blog archives, there doesn't appear to be any indication that the concerns that were brought up around the incorporation of the for-profit have been resolved.
I'm a big fan of Gitea. It's incredibly easy to set up with Docker and it's fast. As a user, the difference in responsiveness between a GitLab instance and a Gitea instance is incredible.
Thanks so much for saying so. Not sure if you are on the latest major release yet, but hopefully you've seen that resource usage is much lower and response times are even faster.
I am! I have my container auto-update so I'm always up to date :) I personally haven't seen decreased response times, though that's likely because my personal instance was already around 30-60ms to generate the page, pretty darn quick regardless!
My solution for this is a ~20 line shell script that gets run from a checked out copy of the repo under /var/www/html and uses inotifywatch on the origin repo (pushed via ssh) to update an HTML file with a tree view and links to all the files in master as well as a second HTML file with per-branch diffs from master. Then it runs "build.sh" and archives the output.
IMO this covers pretty much all the bases, it just doesn't have a flashy GUI. There's way less to configure though and one of the worst ways to spend time is configuring other people's software.
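For the curious, the tree-view step could look roughly like this. It is a sketch in Python rather than the shell script described above, and the function names are invented for illustration. It assumes the list of paths comes from something like `git ls-tree -r --name-only master`.

```python
import html


def build_tree(paths):
    """Turn 'a/b/c.txt'-style paths into nested dicts; files map to None."""
    root = {}
    for path in paths:
        node = root
        parts = path.split("/")
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = None
    return root


def render_tree(tree, prefix=""):
    """Render the nested dict as nested <ul> lists with a link per file."""
    items = []
    for name in sorted(tree):
        child = tree[name]
        full = f"{prefix}{name}"
        if child is None:
            href = html.escape(full, quote=True)
            items.append(f'<li><a href="{href}">{html.escape(name)}</a></li>')
        else:
            items.append(f"<li>{html.escape(name)}/{render_tree(child, full + '/')}</li>")
    return "<ul>" + "".join(items) + "</ul>"
```

Writing the result of `render_tree(build_tree(paths))` into an HTML file next to the checkout gives relative links that resolve against the served working copy.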
That's solving a different, and far easier, problem.
I'm perfectly happy setting up my Git repos on a fileserver I have to access via SSH. That's easy enough. It works fine. But it falls down when I want to share my code with my buddy, and now I have to make a user for him. Or suppose I want some code to be world-readable because it's not sensitive and I want to clone it onto VMs that I don't want to configure to SSH into my dev server. Or I want to put a sensitive repo behind some kind of authorization, and I want full read-write access to it, but I only want my pal to have read-only access.
You can do all these things yourself using standard Unix tools. I've done it. It's possible, but wow, it's way more of a pain in the neck than just installing Forgejo and saying "put repo A behind authentication, make repo B publicly read-only, and grant my clumsy friend read-only access to repo B (but allow him to open a PR if he wants to make a change to it)". Those are all real-world things I want to do, and Forgejo and friends are way easier to configure correctly than Unix permissions and a handful of SSH pub keys.
We are shipping ~300 PRs monthly (plus more getting reviewed), so there are always new things :) I have a few big PRs for longstanding feature requests that should go in soon that I'm pretty excited for.
edit: I'm also pretty excited about the anti-crawler enhancements that went in the latest major release