Reducing barriers in repository discoverability and search
Adrian Edwards
photoace12 at icloud.com
Sat Jul 2 00:03:52 UTC 2022
Hello,
I recently saw your GiveUpGithub page and wanted to share some of my
thoughts/experiences and contribute an idea that, at least for me and my
repositories, would be a very large encouraging factor in favor of
abandoning GitHub.
*Some background:* I have been a GitHub user for pretty much the entire
time I have been developing software and am definitely not a fan of the
direction that things seem to be heading. In addition to the many
reasons on the site, I would also like to add that while on the site
the other day, I noticed that they had a blog post
(https://github.blog/changelog/2022-06-28-transitioning-from-mapbox-to-azure-maps)
discussing a change they were making on behalf of all their users
(except their Enterprise Cloud customers) that would replace Mapbox
maps with Azure Maps. While not as egregious as other actions they have
taken, this in particular put a bit of a sour taste in my mouth as this
appears to me to be a strong signal that they are no longer a
community-supporting organization and are instead turning their octopus
tentacles inwards to put up barriers around their platform.
I have considered switching to other platforms in the past, particularly
when GitHub was acquired, but I hesitated a bit too long and never went
through with it. I had some secondary concerns relating to the time and
effort it would take to migrate not only the repository (issues, project
boards, repository descriptions, etc.) but also the projects themselves
(updating READMEs, re-releasing libraries to update the URL and
READMEs there, etc.) as well as various communities associated with them
(e.g. GitHub Pages-hosted sites without a custom domain). Thankfully
these concerns are less of an issue as many of my projects are
comparatively small.
My primary concern with switching was repository discoverability. As
GitHub is the biggest code-hosting platform, I have almost always
deferred to it as a way to find open source alternatives to software
since a search on their site was the most likely to bring me what I
wanted. I imagine this use case for GitHub search is not unique among
GitHub users. With what I can only imagine is a pretty huge number of
people relying on the GitHub search box, simply being on GitHub gives
you a far better chance of being caught up in someone's search than any
other code hosting platform (as far as I'm aware, please prove me wrong
here).
*The idea:*
Based on my experience, I believe the best way to encourage people to
switch is to:
1. create a FOSS (obviously) universal search service covering as many
of the free public facing source-hosting providers/forges out there
as possible
2. provide these forges with the means/tools to create simple, 1-click
migration options. Maybe these are tools that work with GitHub's API
to extract data not stored in a repo (like descriptions, issues,
discussions, and project cards) or maybe it's a tool to help you
update the URLs in your repository, or sync your repo with GitHub
to ease the transition. In any case, I am considering this out of
scope for the sake of the length of this email.
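To make item 2 slightly more concrete, here is a rough Python sketch of a forge-neutral export for the data that git itself does not carry. Everything in it (the RepoMetadata and Issue classes, the field names, the example URLs) is hypothetical and only meant to illustrate the shape such a tool might take; it does not call any real forge API.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Issue:
    title: str
    body: str
    state: str = "open"  # "open" or "closed"


@dataclass
class RepoMetadata:
    """Hypothetical container for repository data that lives outside git."""
    name: str
    description: str
    source_url: str
    issues: List[Issue] = field(default_factory=list)


def rewrite_urls(text: str, old_base: str, new_base: str) -> str:
    """Point links in a README or description at the project's new home."""
    return text.replace(old_base, new_base)


def export_metadata(meta: RepoMetadata) -> str:
    """Serialize the metadata as JSON so any forge could import it."""
    return json.dumps(asdict(meta), indent=2, sort_keys=True)
```

A migration tool along these lines would fill RepoMetadata from the old forge, run rewrite_urls over the README and description, and hand the JSON to the new forge's importer.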
In discussing with some groups I'm in, I believe there may already be
some prior art on the topic of #1. If you know of anything else, I'd
love to hear about it:
* https://grep.app/ - appears to be an alternative GitHub search tool
and pretty much spot on as far as the idea in my head. No apparent
source is available though
* https://github.com/hound-search/hound - a search service - I don't
know much about this but it could be good as a candidate for the
"engine" that actually does the search
* https://github.com/livegrep/livegrep - another "engine" candidate
While fundamentally different from the core idea, I have also seen
platforms that aim to surface projects in some particular category. This
can vary based on the platform's goals, but includes things like:
* https://ovio.org/ - aims to promote projects that are in need of
contributors
* https://civictech.guide/ - "the world’s biggest collection of
projects using tech for the common good"
* pretty much any of the "awesome lists" and similar curated projects
that exist, see https://project-awesome.org/ for a bunch of them.
Things like http://arewelearningyet.com/ also fall into this group
There are probably many others like this as well. I think these sites
can also be useful data sources, but will need (comparatively) a lot
more work to ensure that only projects with source code available (or
strictly FOSS if you prefer) make it to the site.
Overall I would be very interested in building what is essentially a
web-crawling-and-indexing system for Open source and/or FOSS projects to
allow people to maintain (or better yet improve) the discoverability of
their project and make it easier to find other people's work to build on
top of.
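As a very rough illustration of the indexing half of that idea, here is a minimal Python sketch of an inverted index over repository descriptions gathered from several forges. The Repo and RepoIndex names, the fields, and the example URLs are all assumptions made up for this sketch; a real system would more likely build on an existing engine such as the ones listed above.

```python
import re
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class Repo:
    forge: str        # which forge the repository lives on
    url: str
    description: str


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


class RepoIndex:
    def __init__(self) -> None:
        # token -> URLs of repositories whose description contains it
        self._postings: Dict[str, Set[str]] = defaultdict(set)
        self._repos: Dict[str, Repo] = {}

    def add(self, repo: Repo) -> None:
        self._repos[repo.url] = repo
        for token in tokenize(repo.description):
            self._postings[token].add(repo.url)

    def search(self, query: str) -> List[Repo]:
        """Return repositories matching every query token, ordered by URL."""
        tokens = tokenize(query)
        if not tokens:
            return []
        # .get avoids creating empty postings for unknown tokens
        url_sets = [self._postings.get(t, set()) for t in tokens]
        matches = set.intersection(*url_sets)
        return [self._repos[url] for url in sorted(matches)]
```

The crawler side would be responsible for calling add() for each repository it discovers, regardless of which forge it came from.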
So far this is just an idea, and at this point here are some of the
questions on my mind:
1. What is the goal of this project? Is it to provide a good non-GitHub
search? To provide the best cross-platform repository indexing
service? To be a search engine for FOSS projects?
2. How connected to the GiveUpGithub effort should the project be?
3. Given the answer to #1, since GitHub is itself a code forge/hosting
platform, should GitHub repositories be indexed?
4. If GitHub repositories are indexed, could there be measures taken
that also preserve the mission of GiveUpGithub, like de-prioritizing
GitHub repos relative to others, or allowing self-hosters to customize
the results their instance serves?
5. If GitHub repositories are not indexed, would this action be
unfairly detrimental to users who host their projects on GitHub or
to users who don't care about the ideas behind GiveUpGithub and just
want to find repositories?
6. Would it be better to design the site in a centralized (e.g. a search
engine) or decentralized (a federation of some kind, I don't know) way?
* Would a federated design make it harder for users if many
instances exist with differing result quality?
* Could a federated design give instances the power to essentially
access the same global set of results, but allow them to filter
it how they want (e.g. if one instance wants to search
everything, and another wants only FOSS or only source-available
repositories)?
7. What are the moderation needs of such a platform? How can this be
dealt with sustainably?
8. How can this entire project be sustainable as far as hosting costs,
compensating contributors, etc.?
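Questions 4 and 6 both hint at instances sharing one global result set while applying their own policy to it. Here is a minimal Python sketch of how that could look, assuming each result already carries a relevance score from the shared index. The Result and InstancePolicy names, the license identifiers, and the weights are hypothetical choices for illustration, not a proposal for the actual design.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass(frozen=True)
class Result:
    url: str
    forge: str
    license: Optional[str]  # license identifier, or None when unknown
    score: float            # relevance assigned by the shared index


@dataclass
class InstancePolicy:
    # None means "no license filter"; otherwise only these licenses pass
    allowed_licenses: Optional[Set[str]] = None
    # Multiplier applied per forge; forges not listed keep weight 1.0
    forge_weights: Dict[str, float] = field(default_factory=dict)

    def apply(self, results: List[Result]) -> List[Result]:
        """Filter by license, then rank by forge-weighted relevance."""
        kept = [
            r for r in results
            if self.allowed_licenses is None or r.license in self.allowed_licenses
        ]
        return sorted(
            kept,
            key=lambda r: r.score * self.forge_weights.get(r.forge, 1.0),
            reverse=True,
        )
```

An instance that wants to search everything would use InstancePolicy() unchanged, while one that wants only FOSS results with GitHub de-prioritized might set allowed_licenses and give "github" a weight below 1.0.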
While the initial concept of the project is based on the premise of
allowing people to give up GitHub, I believe staying open-minded and
thinking of all use cases, even those of people who don't agree with
GiveUpGitHub, is the best way to approach this, regardless of the specific
outcomes of these decisions.
Being able to build on top of the work of others is pretty fundamental
to the world of open source and has saved me so much time that I would
have spent reinventing the wheel had I not found someone else who had
done it (or something similar) before me. I would dearly love to live in
a world of near-perfect code discoverability where anyone who can
benefit from my code (or the code of others) is able to find it and
use/build upon it instead of starting from scratch.
While I don't have the time to single-handedly develop my interpretation
of what something like this might look like, I'd love to contribute what
I can to a group effort of some kind to help make something similar to
this happen.
Hope this email wasn't too long or incoherent,
Adrian