Reducing barriers in repository discoverability and search

Adrian Edwards photoace12 at icloud.com
Sat Jul 2 00:03:52 UTC 2022


Hello,

I recently saw your GiveUpGitHub page and wanted to share some of my 
thoughts/experiences and contribute an idea that, at least for me and my 
repositories, would be a very large encouraging factor in favor of 
abandoning GitHub.

*Some background:* I have been a GitHub user for pretty much the entire 
time I have been developing software and am definitely not a fan of the 
direction that things seem to be heading. In addition to the many 
reasons on the site, I would also like to add that, while on the site 
the other day, I noticed that they had a blog post 
(https://github.blog/changelog/2022-06-28-transitioning-from-mapbox-to-azure-maps) 
discussing a change they were making on behalf of all their users 
(except their Enterprise Cloud customers) that would replace Mapbox 
maps with Azure Maps. While not as egregious as other actions they have 
taken, this in particular left a bit of a sour taste in my mouth, as it 
appears to me to be a strong signal that they are no longer a 
community-supporting organization and are instead turning their octopus 
tentacles inwards to put up barriers around their platform.

I have considered switching to other platforms in the past, particularly 
when GitHub was acquired, but I hesitated a bit too long and never went 
through with it. I had some secondary concerns relating to the time and 
effort it would take to migrate not only the repository (issues, project 
boards, repository descriptions, etc.) but also the projects themselves 
(updating READMEs, re-releasing libraries to update the URL and 
READMEs there, etc.) as well as various communities associated with them 
(e.g. GitHub Pages-hosted sites without a custom domain). Thankfully 
these concerns are less of an issue as many of my projects are 
comparatively small.

My primary concern with switching was repository discoverability. As 
GitHub is the biggest code-hosting platform, I have almost always 
deferred to it as a way to find open source alternatives to software 
since a search on their site was the most likely to bring me what I 
wanted. I imagine this use case for GitHub search is not unique among 
GitHub users. With what I can only imagine is a pretty huge number of 
people relying on the GitHub search box, simply being on GitHub gives 
you a far better chance of being caught up in someone's search than any 
other code hosting platform (as far as I'm aware, please prove me wrong 
here).


*The idea:*

Based on my experience, I believe the best way to encourage people to 
switch is to:

 1. create a FOSS (obviously) universal search service covering as many
    of the free public-facing source-hosting providers/forges out there
    as possible
 2. provide these forges with the means/tools to create simple, 1-click
    migration options. Maybe these are tools that work with GitHub's API
    to extract data not stored in a repo (like descriptions, issues,
    discussions, and project cards), or maybe it's a tool to help you
    update the URLs in your repository, or sync your repo with GitHub
    to ease the transition. In any case, I am considering this out of
    scope for the sake of the length of this email.

In discussing with some groups I'm in, I believe there may already be 
some prior art on the topic of #1. If you know of anything else, I'd 
love to hear about it:

  * https://grep.app/ - appears to be an alternative GitHub search tool
    and pretty much spot on as far as the idea in my head. No apparent
    source is available though
  * https://github.com/hound-search/hound - a search service - I don't
    know much about this but it could be good as a candidate for the
    "engine" that actually does the search
  * https://github.com/livegrep/livegrep - another "engine" candidate


While fundamentally different from the core idea, I have also seen 
platforms that aim to surface projects in some particular category. This 
can vary based on the platform's goals, but includes things like:

  * https://ovio.org/ - aims to promote projects that are in need of
    contributors
  * https://civictech.guide/ - "the world’s biggest collection of
    projects using tech for the common good"
  * pretty much any of the "awesome lists" and similar curated projects
    that exist, see https://project-awesome.org/ for a bunch of them.
    Things like http://arewelearningyet.com/ also fall into this group

There are probably many others like this as well. I think these sites 
can also be useful data sources, but will need (comparatively) a lot 
more work to ensure that only projects with source code available (or 
strictly FOSS if you prefer) make it to the site.

Overall I would be very interested in building what is essentially a 
web-crawling-and-indexing system for open source and/or FOSS projects to 
allow people to maintain (or better yet improve) the discoverability of 
their project and make it easier to find other people's work to build on 
top of.
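
To make the indexing idea a bit more concrete, here is a minimal sketch 
of an inverted index over repository descriptions gathered from several 
forges. Everything in it is an assumption for illustration: the `Repo` 
fields, the forge labels, and the example.* URLs are hypothetical, and a 
real service would more likely build on an existing engine such as the 
ones listed above than on a hand-rolled index.

```python
import re
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Repo:
    forge: str        # hypothetical forge label, e.g. "codeberg"
    url: str          # placeholder URL, not a real repository
    description: str


def tokenize(text):
    """Lowercase the text and split it on non-alphanumeric characters."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]


def build_index(repos):
    """Map each token to the set of repos whose description contains it."""
    index = defaultdict(set)
    for repo in repos:
        for token in tokenize(repo.description):
            index[token].add(repo)
    return index


def search(index, query):
    """Return the repos that match every token in the query (AND search)."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    results = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        results &= index.get(token, set())
    return results


# Made-up sample data standing in for what a crawler would collect.
REPOS = [
    Repo("codeberg", "https://codeberg.example/alice/mapview", "Offline map viewer"),
    Repo("sourcehut", "https://sr.example/~bob/tilecache", "Tile cache for map servers"),
    Repo("github", "https://github.example/carol/notes", "Markdown notes app"),
]
```

Searching the sample data for "map" would return the first two 
repositories regardless of which forge hosts them, which is the whole 
point of a cross-forge index.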

So far this is just an idea, and at this point here are some of the 
questions on my mind:

 1. What is the goal of this project? Is it to provide a good non-GitHub
    search? To provide the best cross-platform repository indexing
    service? To be a search engine for FOSS projects?
 2. How connected to the GiveUpGitHub effort should the project be?
 3. Given the answer to #1, since GitHub is itself a code forge/hosting
    platform, should GitHub repositories be indexed?
 4. If GitHub repositories are indexed, could there be measures taken
    that also preserve the mission of GiveUpGitHub, such as
    de-prioritizing GitHub repos relative to others, or allowing
    self-hosters to customize the results their instance serves?
 5. If GitHub repositories are not indexed, would this action be
    unfairly detrimental to users who host their projects on GitHub or
    to users who don't care about the ideas behind GiveUpGitHub and just
    want to find repositories?
 6. Would it be better to design the site in a centralized (i.e. search
    engine) or decentralized (a federation of some kind) way?
      * Would a federated design make it harder for users if many
        instances exist with differing result quality?
      * Could a federated design give instances the power to essentially
        access the same global set of results, but allow them to filter
        it how they want (i.e. if one instance wants to search
        everything, and another wants only FOSS or only source-available
        repositories)?
 7. What are the moderation needs of such a platform? How can this be
    dealt with sustainably?
 8. How can this entire project be sustainable as far as hosting costs,
    compensating contributors, etc.?
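
As a rough illustration of questions 4 and 6, here is a sketch of how an 
instance might apply its own policy to a shared set of results: a 
per-forge weight to de-prioritize some forges, and an optional license 
allow-list for a FOSS-only instance. The `Hit` and `InstancePolicy` 
types, the field names, and the weights are all hypothetical, and this 
is only one of many ways such filtering could work.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Optional


@dataclass(frozen=True)
class Hit:
    url: str
    forge: str
    license: str   # license identifier, "" when unknown
    score: float   # relevance score from the underlying search engine


@dataclass
class InstancePolicy:
    # Multiplier applied to the score per forge; unlisted forges get 1.0.
    forge_weight: Dict[str, float] = field(default_factory=dict)
    # When set, only hits with one of these licenses are kept.
    allowed_licenses: Optional[FrozenSet[str]] = None


def apply_policy(hits, policy):
    """Filter hits by license, then sort by forge-weighted score."""
    kept = [
        h for h in hits
        if policy.allowed_licenses is None or h.license in policy.allowed_licenses
    ]
    return sorted(
        kept,
        key=lambda h: h.score * policy.forge_weight.get(h.forge, 1.0),
        reverse=True,
    )


# Made-up results standing in for the shared global result set.
HITS = [
    Hit("a", "github", "MIT", 1.0),
    Hit("b", "codeberg", "GPL-3.0-or-later", 0.8),
    Hit("c", "github", "", 0.9),
]

search_everything = InstancePolicy()
deprioritize_github = InstancePolicy(forge_weight={"github": 0.5})
foss_only = InstancePolicy(
    allowed_licenses=frozenset({"MIT", "GPL-3.0-or-later"})
)
```

With this shape, every instance sees the same hits, and the differences 
between instances come down to a small, inspectable policy object.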

While the initial concept of the project is based on the premise of 
allowing people to give up GitHub, I believe staying open-minded and 
thinking of all use cases, even those of people who don't agree with 
GiveUpGitHub, is the best way to approach this, regardless of the specific 
outcomes of these decisions.

Being able to build on top of the work of others is pretty fundamental 
to the world of open source and has saved me so much time that I would 
have spent reinventing the wheel had I not found someone else who had 
done it (or something similar) before me. I would dearly love to live in 
a world of near-perfect code discoverability where anyone who can 
benefit from my code (or the code of others) is able to find it and 
use/build upon it instead of starting from scratch.

While I don't have the time to single-handedly develop my interpretation 
of what something like this might look like, I'd love to contribute what 
I can to a group effort of some kind to help make something similar to 
this happen.

Hope this email wasn't too long or incoherent,
Adrian




More information about the Give-Up-GitHub mailing list