<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:dw="https://www.dreamwidth.org">
  <id>tag:dreamwidth.org,2009-05-21:377446</id>
  <title>Ian Jackson</title>
  <subtitle>Ian Jackson</subtitle>
  <author>
    <name>Ian Jackson</name>
  </author>
  <link rel="alternate" type="text/html" href="https://diziet.dreamwidth.org/"/>
  <link rel="self" type="text/xml" href="https://diziet.dreamwidth.org/data/atom"/>
  <updated>2025-12-21T23:24:29Z</updated>
  <dw:journal username="diziet" type="personal"/>
  <entry>
    <id>tag:dreamwidth.org,2009-05-21:377446:20436</id>
    <link rel="alternate" type="text/html" href="https://diziet.dreamwidth.org/20436.html"/>
    <link rel="self" type="text/xml" href="https://diziet.dreamwidth.org/data/atom/?itemid=20436"/>
    <title>Debian’s git transition</title>
    <published>2025-12-21T23:08:31Z</published>
    <updated>2025-12-21T23:24:29Z</updated>
    <category term="debian"/>
    <category term="tag2upload"/>
    <category term="git"/>
    <category term="dgit"/>
    <dw:security>public</dw:security>
    <dw:reply-count>0</dw:reply-count>
    <content type="html">&lt;p&gt;tl;dr:
&lt;p&gt;There is a Debian git transition plan. It&amp;rsquo;s going OK so far but we need help, especially with outreach and updating Debian&amp;rsquo;s documentation.
&lt;ul&gt;&lt;li&gt;&lt;a href="#goals-of-the-debian-git-transition-project"&gt;Goals of the Debian git transition project&lt;/a&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="#achievements-so-far-and-current-status"&gt;Achievements so far, and current status&lt;/a&gt;
&lt;/li&gt;&lt;/ul&gt;

&lt;li&gt;&lt;a href="#core-engineering-principle"&gt;Core engineering principle&lt;/a&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="#correspondence-between-dsc-and-git"&gt;Correspondence between dsc and git&lt;/a&gt;
&lt;li&gt;&lt;a href="#patches-applied-vs-patches-unapplied"&gt;Patches-applied vs patches-unapplied&lt;/a&gt;
&lt;li&gt;&lt;a href="#consequences-some-of-which-are-annoying"&gt;Consequences, some of which are annoying&lt;/a&gt;
&lt;/li&gt;&lt;/li&gt;&lt;/li&gt;&lt;/ul&gt;

&lt;li&gt;&lt;a href="#distributing-the-source-code-as-git"&gt;Distributing the source code as git&lt;/a&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="#tracking-the-relevant-git-data-when-changes-are-made-in-the-legacy-archive"&gt;Tracking the relevant git data, when changes are made in the legacy Archive&lt;/a&gt;
&lt;li&gt;&lt;a href="#why-.dgit.debian.org-is-not-salsa"&gt;Why *.dgit.debian.org is not Salsa&lt;/a&gt;
&lt;/li&gt;&lt;/li&gt;&lt;/ul&gt;

&lt;li&gt;&lt;a href="#roadmap"&gt;Roadmap&lt;/a&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="#in-progress"&gt;In progress&lt;/a&gt;
&lt;li&gt;&lt;a href="#future-technology"&gt;Future Technology&lt;/a&gt;
&lt;/li&gt;&lt;/li&gt;&lt;/ul&gt;

&lt;li&gt;&lt;a href="#mindshare-and-adoption---please-help"&gt;Mindshare and adoption - please help!&lt;/a&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="#a-rant-about-publishing-the-source-code"&gt;A rant about publishing the source code&lt;/a&gt;
&lt;li&gt;&lt;a href="#documentation"&gt;Documentation&lt;/a&gt;
&lt;/li&gt;&lt;/li&gt;&lt;/ul&gt;

&lt;li&gt;&lt;a href="#personnel"&gt;Personnel&lt;/a&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="#thanks"&gt;Thanks&lt;/a&gt;
&lt;/li&gt;&lt;/ul&gt;

&lt;/li&gt;&lt;/li&gt;&lt;/li&gt;&lt;/li&gt;&lt;/li&gt;&lt;/li&gt;&lt;/ul&gt;
&lt;a name="cutid1"&gt;&lt;/a&gt;
&lt;h1&gt;&lt;a name="goals-of-the-debian-git-transition-project"&gt;Goals of the Debian git transition project&lt;/a&gt;&lt;/h1&gt;
&lt;ol start="0" type="1"&gt;&lt;li&gt;&lt;strong&gt;Everyone who interacts with Debian source code should be able to do so entirely in git.&lt;/strong&gt;
&lt;/li&gt;&lt;/ol&gt;
&lt;p&gt;That means, more specifically:
&lt;ol type="1"&gt;&lt;li&gt;&lt;p&gt;All examination and edits to the source should be performed via normal git operations.

&lt;li&gt;&lt;p&gt;Source code should be transferred and exchanged as git data, not tarballs. git should be the canonical form everywhere.

&lt;li&gt;&lt;p&gt;Upstream git histories should be re-published, traceably, as part of formal git releases published by Debian.

&lt;li&gt;&lt;p&gt;No-one should have to learn about Debian Source Packages, which are bizarre, and have been obsoleted by modern version control.

&lt;/p&gt;&lt;/li&gt;&lt;/p&gt;&lt;/li&gt;&lt;/p&gt;&lt;/li&gt;&lt;/p&gt;&lt;/li&gt;&lt;/ol&gt;
&lt;p&gt;This is very ambitious, but we have come a long way!
&lt;h2&gt;&lt;a name="achievements-so-far-and-current-status"&gt;Achievements so far, and current status&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;We have come a very long way. But there is still much to do. In particular, the git transition team &lt;strong&gt;needs your help with adoption, developer outreach, and developer documentation overhaul.&lt;/strong&gt;
&lt;p&gt;We&amp;rsquo;ve made big strides towards goals 1 and 4. Goal 2 is partially achieved: we currently have dual running. Goal 3 is within our reach but depends on widespread adoption of tag2upload (and/or dgit push).
&lt;p&gt;Downstreams and users can &lt;a href="https://diziet.dreamwidth.org/17579.html"&gt;obtain the source code of any Debian package&lt;/a&gt; in git form. (&lt;a href="https://manpages.debian.org/trixie/dgit/dgit.1.en.html#dgit"&gt;dgit clone&lt;/a&gt;, 2013). They can then work with this source code completely in git, including building binaries, merging new versions, even automatically (eg &lt;a href="https://github.com/plugwash/autoforwardportergit?tab=readme-ov-file#pooltogit"&gt;Raspbian&lt;/a&gt;, 2016), and all without having to deal with source packages at all (eg &lt;a href="https://debconf25.debconf.org/talks/119-automating-downstream-debian-package-builds-and-updates-in-ci/"&gt;Wikimedia&lt;/a&gt;, 2025).
&lt;p&gt;A Debian maintainer can maintain their own package entirely in git. They can obtain upstream source code from git, and do their packaging work in git (&lt;code&gt;git-buildpackage&lt;/code&gt;, 2006).
&lt;p&gt;Every Debian maintainer can (and should!) release their package &lt;em&gt;from git&lt;/em&gt; reliably and in a standard form (&lt;a href="https://manpages.debian.org/trixie/dgit/dgit.1.en.html#dgit~15"&gt;dgit push&lt;/a&gt;, 2013; &lt;a href="https://wiki.debian.org/tag2upload"&gt;tag2upload&lt;/a&gt;, 2025). This is not only more principled, but also more convenient, and with better UX, than pre-dgit tooling like &lt;code&gt;dput&lt;/code&gt;.
&lt;p&gt;Indeed a Debian maintainer can now often release their changes to Debian, from git, using &lt;em&gt;only&lt;/em&gt; git branches (so no tarballs). Releasing to Debian can be simply pushing a signed tag (&lt;a href="https://wiki.debian.org/tag2upload"&gt;tag2upload&lt;/a&gt;, 2025).
&lt;p&gt;A Debian maintainer can maintain a stack of changes to upstream source code in git (&lt;a href="https://manpages.debian.org/trixie/git-buildpackage/gbp-pq.1.en.html"&gt;gbp pq&lt;/a&gt;, 2009). They can even maintain such a delta series as a rebasing git branch, directly buildable, and use normal &lt;code&gt;git rebase&lt;/code&gt; style operations to edit their changes (&lt;a href="https://manpages.debian.org/trixie/git-dpm/git-dpm.1.en.html"&gt;git-dpm&lt;/a&gt;, 2010; &lt;a href="https://manpages.debian.org/trixie/git-debrebase/git-debrebase.1.en.html"&gt;git-debrebase&lt;/a&gt;, 2018).
&lt;p&gt;An authorised Debian developer can do a modest update to &lt;em&gt;any&lt;/em&gt; package in Debian, even one maintained by someone else, working entirely in git in a &lt;a href="https://manpages.debian.org/testing/dgit/dgit-nmu-simple.7.en.html"&gt;standard and convenient way&lt;/a&gt; (dgit, 2013).
&lt;p&gt;Debian contributors can share their work-in-progress on git forges and collaborate using merge requests, git based code review, and so on. (Alioth, 2003; Salsa, 2018.)
&lt;h1&gt;&lt;a name="core-engineering-principle"&gt;Core engineering principle&lt;/a&gt;&lt;/h1&gt;
&lt;p&gt;The Debian git transition project is based on one core engineering principle:
&lt;p&gt;&lt;strong&gt;Every Debian Source Package can be losslessly converted to and from git.&lt;/strong&gt;
&lt;p&gt;In order to &lt;em&gt;transition&lt;/em&gt; away from Debian Source Packages, we need to &lt;em&gt;gateway&lt;/em&gt; between the old &lt;code&gt;dsc&lt;/code&gt; approach, and the new git approach.
&lt;p&gt;This gateway obviously needs to be bidirectional: source packages uploaded with legacy tooling like &lt;code&gt;dput&lt;/code&gt; need to be imported into a canonical git representation; and of course git branches prepared by developers need to be converted to source packages for the benefit of legacy downstream systems (such as the Debian Archive and &lt;code&gt;apt source&lt;/code&gt;).
&lt;p&gt;This bidirectional gateway is implemented in &lt;a href="https://salsa.debian.org/dgit-team/dgit"&gt;&lt;code&gt;src:dgit&lt;/code&gt;&lt;/a&gt;, and is allowing us to gradually replace dsc-based parts of the Debian system with git-based ones.
&lt;h2&gt;&lt;a name="correspondence-between-dsc-and-git"&gt;Correspondence between dsc and git&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;A faithful bidirectional gateway must define an invariant:
&lt;p&gt;&lt;strong&gt;The canonical git tree, corresponding to a .dsc, is the tree resulting from &lt;code&gt;dpkg-source -x&lt;/code&gt;&lt;/strong&gt;.
&lt;p&gt;This canonical form is sometimes called the &amp;ldquo;dgit view&amp;rdquo;. It&amp;rsquo;s sometimes not the same as the maintainer&amp;rsquo;s git branch, because many maintainers are still working with &amp;ldquo;patches-unapplied&amp;rdquo; git branches. More on this below.
&lt;p&gt;(For &lt;code&gt;3.0 (quilt)&lt;/code&gt; .dscs, the canonical git tree doesn&amp;rsquo;t include the quilt &lt;code&gt;.pc&lt;/code&gt; directory.)
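&lt;p&gt;As an illustration of what this invariant means in practice, here is a minimal sketch in Python of a check that a git checkout has the same files as an extracted source package. This is not how dgit implements it. The names &lt;code&gt;tree_digest&lt;/code&gt; and &lt;code&gt;trees_match&lt;/code&gt; are hypothetical, and the sketch compares only the contents of regular files: file modes and symlinks are not handled faithfully.&lt;/p&gt;
&lt;pre&gt;
```python
import hashlib
import os

# Bookkeeping directories that are not part of the source itself:
# git metadata, and quilt state (which the canonical tree excludes).
# Simplification: they are ignored at any depth, not just the top level.
IGNORED_DIRS = {".git", ".pc"}


def tree_digest(root):
    """Map each relative file path under root to a SHA-256 of its contents."""
    digests = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so that os.walk does not descend into ignored dirs.
        dirnames[:] = [d for d in dirnames if d not in IGNORED_DIRS]
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            with open(full, "rb") as fh:
                digests[rel] = hashlib.sha256(fh.read()).hexdigest()
    return digests


def trees_match(extracted_dir, git_checkout_dir):
    """True if both trees hold the same files with the same contents."""
    return tree_digest(extracted_dir) == tree_digest(git_checkout_dir)
```
&lt;/pre&gt;
&lt;p&gt;One would call it as, say, &lt;code&gt;trees_match("example-1.0", "example-git")&lt;/code&gt;, where the first directory (a made-up name) came from &lt;code&gt;dpkg-source -x&lt;/code&gt; and the second is a checkout of the canonical branch.&lt;/p&gt;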
&lt;h2&gt;&lt;a name="patches-applied-vs-patches-unapplied"&gt;Patches-applied vs patches-unapplied&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;The canonical git format is &amp;ldquo;patches applied&amp;rdquo;. That is:
&lt;p&gt;&lt;strong&gt;If Debian has modified the upstream source code, a normal git clone of the canonical branch gives the modified source tree, ready for reading and building.&lt;/strong&gt;
&lt;p&gt;Many Debian maintainers keep their packages in a different git branch format, where the changes made by Debian, to the upstream source code, are in actual &lt;code&gt;patch&lt;/code&gt; files in a &lt;code&gt;debian/patches/&lt;/code&gt; subdirectory.
&lt;p&gt;Patches-applied has a number of important advantages over patches-unapplied:
&lt;ul&gt;&lt;li&gt;&lt;p&gt;&lt;strong&gt;It is familiar to, and doesn&amp;rsquo;t trick, outsiders to Debian&lt;/strong&gt;. Debian insiders radically underestimate how weird &amp;ldquo;patches-unapplied&amp;rdquo; is. Even expert software developers can get very confused or even &lt;a href="https://diziet.dreamwidth.org/9556.html"&gt;accidentally build binaries without security patches&lt;/a&gt;!

&lt;li&gt;&lt;p&gt;Making changes can be done with just normal git commands, eg &lt;code&gt;git commit&lt;/code&gt;. Many Debian insiders working with patches-unapplied are still using &lt;a href="https://manpages.debian.org/trixie/quilt/quilt.1.en.html"&gt;&lt;code&gt;quilt(1)&lt;/code&gt;&lt;/a&gt;, a footgun-rich contraption for working with patch files!

&lt;li&gt;&lt;p&gt;When developing, one can make changes to upstream code, and to Debian packaging, together, without ceremony. There is no need to switch back and forth between patch queue and packaging branches (as with &lt;code&gt;gbp pq&lt;/code&gt;), no need to &amp;ldquo;commit&amp;rdquo; patch files, etc. One can always edit every file and commit it with &lt;code&gt;git commit&lt;/code&gt;.

&lt;/p&gt;&lt;/li&gt;&lt;/p&gt;&lt;/li&gt;&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;
&lt;p&gt;The downside is that, with the (bizarre) &lt;code&gt;3.0 (quilt)&lt;/code&gt; source format, the patch files in &lt;code&gt;debian/patches/&lt;/code&gt; must somehow be kept up to date. Nowadays though, tools like &lt;code&gt;git-debrebase&lt;/code&gt; and &lt;code&gt;git-dpm&lt;/code&gt; (and dgit for NMUs) make it very easy to work with patches-applied git branches. &lt;code&gt;git-debrebase&lt;/code&gt; can deal very ergonomically even with &lt;a href="https://salsa.debian.org/xen-team/debian-xen"&gt;big patch stacks&lt;/a&gt;.
&lt;p&gt;(For smaller packages which usually have no patches, &lt;a href="https://manpages.debian.org/trixie/dgit/dgit-maint-merge.7.en.html"&gt;plain &lt;code&gt;git merge&lt;/code&gt; with an upstream git branch&lt;/a&gt;, and a &lt;a href="https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1007717#384"&gt;much simpler dsc format&lt;/a&gt;, sidesteps the problem entirely.)
&lt;h3&gt;&lt;a name="prioritising-debians-users-and-other-outsiders"&gt;Prioritising Debian&amp;rsquo;s users (and other outsiders)&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;We want everyone to be able to share and modify the software that they interact with. That means we should make source code truly accessible, on the user&amp;rsquo;s terms.
&lt;p&gt;Many of Debian&amp;rsquo;s processes assume everyone is an insider. It&amp;rsquo;s okay that there are Debian insiders and that people feel part of something that they worked hard to become involved with. But lack of perspective can lead to software which fails to uphold our values.
&lt;p&gt;Our source code practices &amp;mdash; in particular, our determination to share properly (and systematically) &amp;mdash; are a key part of what makes Debian worthwhile at all. Like Debian&amp;rsquo;s installer, we want our source code to be useable by Debian outsiders.
&lt;p&gt;This is why we have chosen to privilege a git branch format which is more familiar to the world at large, even if it&amp;rsquo;s less popular in Debian.
&lt;h2&gt;&lt;a name="consequences-some-of-which-are-annoying"&gt;Consequences, some of which are annoying&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;The requirement that the conversion be &lt;em&gt;bidirectional&lt;/em&gt;, &lt;em&gt;lossless&lt;/em&gt;, and &lt;em&gt;context-free&lt;/em&gt; can be inconvenient.
&lt;p&gt;For example, &lt;a href="https://manpages.debian.org/trixie/dgit/dgit.7.en.html#GITATTRIBUTES"&gt;we cannot support &lt;code&gt;.gitattributes&lt;/code&gt;&lt;/a&gt; which modify files during git checkin and checkout. &lt;code&gt;.gitattributes&lt;/code&gt; cause the meaning of a git tree to depend on the context, in possibly arbitrary ways, so the conversion from git to source package wouldn&amp;rsquo;t be stable. And, worse, some source packages might not be representable in git at all.
&lt;p&gt;Another example: Maintainers often have existing git branches for their packages, generated with pre-dgit tooling which is less careful and less principled than ours. That can result in discrepancies between git and dsc, which need to be resolved before a proper git-based upload can succeed.
&lt;p&gt;That some maintainers use patches-applied, and some patches-unapplied, means that there &lt;em&gt;has&lt;/em&gt; to be some kind of conversion to a standard git representation. Choosing the less-popular patches-applied format as the canonical form, means that &lt;em&gt;many&lt;/em&gt; packages need their git representation converted. It also means that user- and outsider-facing branches from &lt;code&gt;{browse,git}.dgit.d.o&lt;/code&gt; and &lt;code&gt;dgit clone&lt;/code&gt; are not always compatible with maintainer branches on Salsa. User-contributed changes need cherry-picking rather than merging, or conversion back to the maintainer format. The good news is that dgit can automate much of this, and the manual parts are usually easy git operations.
&lt;h1&gt;&lt;a name="distributing-the-source-code-as-git"&gt;Distributing the source code as git&lt;/a&gt;&lt;/h1&gt;
&lt;p&gt;Our source code management should be normal, modern, and based on git. That means the Debian Archive is obsolete and needs to be replaced with a set of git repositories.
&lt;p&gt;The replacement repository for source code formally released to Debian is &lt;a href="https://browse.dgit.debian.org/"&gt;&lt;code&gt;*.dgit.debian.org&lt;/code&gt;&lt;/a&gt;. This contains all the git objects for every git-based upload since 2013, including the signed tag for each released package version.
&lt;p&gt;The plan is that it will contain a git view of &lt;em&gt;every&lt;/em&gt; uploaded Debian package, by centrally importing all legacy uploads into git.
&lt;h2&gt;&lt;a name="tracking-the-relevant-git-data-when-changes-are-made-in-the-legacy-archive"&gt;Tracking the relevant git data, when changes are made in the legacy Archive&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;Currently, many critical source code management tasks are done by changes to the legacy Debian Archive, which works entirely with dsc files (and the associated tarballs etc). The contents of the Archive are therefore still an important source of truth. But, the Archive&amp;rsquo;s architecture means it cannot sensibly directly contain git data.
&lt;p&gt;To track changes made in the Archive, we added the &lt;a href="https://www.debian.org/doc/debian-policy/ch-controlfields.html#s-f-dgit"&gt;&lt;code&gt;Dgit:&lt;/code&gt;&lt;/a&gt; field to the &lt;code&gt;.dsc&lt;/code&gt; of a git-based upload (2013). This declares which git commit this package was converted from, and where those git objects can be obtained.
&lt;p&gt;Thus, given a Debian Source Package from a git-based upload, it is possible for the new git tooling to obtain the equivalent git objects. If the user is going to work in git, there is no need for any tarballs to be downloaded: the git data could be obtained from the depository using the git protocol.
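&lt;p&gt;To make the lookup concrete, here is a minimal sketch in Python of pulling the commit out of a &lt;code&gt;Dgit:&lt;/code&gt; field. It assumes the field starts with the full commit hash, optionally followed by whitespace-separated further data, and that it may be folded onto continuation lines. The function name &lt;code&gt;parse_dgit_field&lt;/code&gt; and the example &lt;code&gt;.dsc&lt;/code&gt; text (package name, hash, tag and URL) are made up for illustration. A real &lt;code&gt;.dsc&lt;/code&gt; is also PGP-signed, which this sketch ignores.&lt;/p&gt;
&lt;pre&gt;
```python
def parse_dgit_field(dsc_text):
    """Return the Dgit field of a .dsc as a dict, or None if it is absent."""
    value = None
    for line in dsc_text.splitlines():
        if value is None:
            if line.lower().startswith("dgit:"):
                value = line.split(":", 1)[1]
        elif line[:1] in (" ", "\t"):
            # Continuation line of a folded field.
            value += " " + line
        else:
            break
    if value is None:
        return None
    words = value.split()
    if not words:
        return None
    # First word: the commit hash. The rest is kept as opaque extra data.
    return {"commit": words[0], "extra": words[1:]}


# Hypothetical .dsc excerpt; every value here is invented for illustration.
EXAMPLE_DSC = """\
Format: 3.0 (quilt)
Source: example
Version: 1.0-1
Dgit: 0123456789abcdef0123456789abcdef01234567 debian archive/debian/1.0-1 https://git.dgit.debian.org/example
Files:
 00000000000000000000000000000000 1234 example_1.0.orig.tar.xz
"""

info = parse_dgit_field(EXAMPLE_DSC)
print(info["commit"])
print(info["extra"])
```
&lt;/pre&gt;
&lt;p&gt;With the commit hash and the location of the git objects in hand, a tool can fetch from the depository instead of downloading tarballs.&lt;/p&gt;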
&lt;p&gt;The &lt;a href="https://browse.dgit.debian.org/libsdl2-ttf.git/tag/?h=archive/debian/2.24.0%2bdfsg-3"&gt;signed&lt;/a&gt; &lt;a href="https://browse.dgit.debian.org/libsdl2-ttf.git/tag/?h=debian/2.24.0%2bdfsg-3"&gt;tags&lt;/a&gt;, available from the git depository, have &lt;a href="https://manpages.debian.org/trixie/git-debpush/tag2upload.5.en.html"&gt;standardised metadata&lt;/a&gt; which gives traceability back to the uploading Debian contributor.
&lt;h2&gt;&lt;a name="why-.dgit.debian.org-is-not-salsa"&gt;Why *.dgit.debian.org is not Salsa&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;We need a git &lt;em&gt;depository&lt;/em&gt; - a formal, reliable and permanent git repository of source code actually released to Debian.
&lt;p&gt;Git forges like Gitlab can be very convenient. But Gitlab is not sufficiently secure, &lt;a href="https://gitlab.com/gitlab-org/gitlab/-/issues/429516"&gt;and&lt;/a&gt; &lt;a href="https://gitlab.com/gitlab-org/gitlab/-/issues/472646"&gt;too&lt;/a&gt; &lt;a href="https://gitlab.com/gitlab-org/gitlab/-/issues/581752"&gt;full&lt;/a&gt; &lt;a href="https://gitlab.com/gitlab-org/gitlab/-/issues/581897"&gt;of&lt;/a&gt; &lt;a href="https://gitlab.com/gitlab-org/gitlab/-/issues/217231"&gt;bugs&lt;/a&gt;, to be the principal and only archive of all our source code. (The &amp;ldquo;open core&amp;rdquo; business model of the Gitlab corporation, and the constant-churn development approach, are &lt;a href="https://mako.cc/writing/hill-free_tools.html"&gt;critical&lt;/a&gt; underlying problems.)
&lt;p&gt;Our git depository lacks forge features like Merge Requests. But:
&lt;ul&gt;&lt;li&gt;It is dependable, both in terms of reliability and security.
&lt;li&gt;It is append-only: once something is pushed, it is permanently recorded.
&lt;li&gt;Its access control is precisely that of the Debian Archive.
&lt;li&gt;Its ref namespace is standardised and corresponds to Debian releases.
&lt;li&gt;Pushes are authorised by PGP signatures, not ssh keys, so traceable.
&lt;/li&gt;&lt;/li&gt;&lt;/li&gt;&lt;/li&gt;&lt;/li&gt;&lt;/ul&gt;
&lt;p&gt;The dgit git depository outlasted Alioth and it may well outlast Salsa.
&lt;p&gt;We need &lt;em&gt;both&lt;/em&gt; a good forge, and the &lt;code&gt;*.dgit.debian.org&lt;/code&gt; formal git depository.
&lt;h1&gt;&lt;a name="roadmap"&gt;Roadmap&lt;/a&gt;&lt;/h1&gt;
&lt;h2&gt;&lt;a name="in-progress"&gt;In progress&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;Right now we are quite focused on &lt;a href="https://wiki.debian.org/tag2upload"&gt;&lt;strong&gt;tag2upload&lt;/strong&gt;&lt;/a&gt;.
&lt;p&gt;We are working hard on eliminating the remaining issues that we feel need to be addressed before declaring the service out of beta.
&lt;h2&gt;&lt;a name="future-technology"&gt;Future Technology&lt;/a&gt;&lt;/h2&gt;
&lt;h3&gt;&lt;a name="whole-archive-dsc-importer"&gt;Whole-archive dsc importer&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Currently, the git depository only has git data for git-based package updates (tag2upload and dgit push). Legacy dput-based uploads are not currently present there. This means that the git-based and legacy uploads must be resolved client-side, by &lt;code&gt;dgit clone&lt;/code&gt;.
&lt;p&gt;We will want to start importing legacy uploads to git.
&lt;p&gt;Then downstreams and users will be able to get the source code for any package simply with &lt;code&gt;git clone&lt;/code&gt;, even if the maintainer is using legacy upload tools like dput.
&lt;h3&gt;&lt;a name="support-for-git-based-uploads-to-security.debian.org"&gt;Support for git-based uploads to security.debian.org&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Security patching is a task which would particularly benefit from better and more formal use of git. git-based approaches to applying and backporting security patches are much more convenient than messing about with actual patch files.
&lt;p&gt;Currently, one can use git to help prepare a security upload, but it often involves starting with a dsc import (which lacks the proper git history) or figuring out a package maintainer&amp;rsquo;s unstandardised git usage conventions on Salsa.
&lt;p&gt;And it is not possible to properly perform the security release &lt;em&gt;as git&lt;/em&gt;.
&lt;h3&gt;&lt;a name="internal-debian-consumers-switch-to-getting-source-from-git"&gt;Internal Debian consumers switch to getting source from git&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Buildds, QA work such as lintian checks, and so on, could be simpler if they don&amp;rsquo;t need to deal with source packages.
&lt;p&gt;And since git is actually the canonical form, we want them to use it directly.
&lt;h3&gt;&lt;a name="problems-for-the-distant-future"&gt;Problems for the distant future&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;For decades, Debian has been built around source packages. Replacing them is a long and complex process. Certainly source packages are going to continue to be supported for the foreseeable future.
&lt;p&gt;There are no doubt going to be unanticipated problems. There are also foreseeable issues: for example, perhaps there are packages that work very badly when represented in git. We think we can rise to these challenges as they come up.
&lt;h1&gt;&lt;a name="mindshare-and-adoption---please-help"&gt;Mindshare and adoption - please help!&lt;/a&gt;&lt;/h1&gt;
&lt;p&gt;We and our users are very pleased with our technology. It is convenient and highly dependable.
&lt;p&gt;&lt;code&gt;dgit&lt;/code&gt; in particular is superb, even if we say so ourselves. As technologists, we have been very focused on building good software, but it seems we have fallen short in the marketing department.
&lt;h2&gt;&lt;a name="a-rant-about-publishing-the-source-code"&gt;A rant about publishing the source code&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;git is the preferred form for modification&lt;/strong&gt;.
&lt;p&gt;Our upstreams are overwhelmingly using git. We are overwhelmingly using git. It is a scandal that for many packages, Debian does not properly, formally and officially publish the git history.
&lt;p&gt;&lt;em&gt;Properly&lt;/em&gt; publishing the source code as git means publishing it in a way that means that anyone can &lt;em&gt;automatically&lt;/em&gt; and &lt;em&gt;reliably&lt;/em&gt; obtain &lt;em&gt;and build&lt;/em&gt; the &lt;em&gt;exact&lt;/em&gt; source code corresponding to the binaries. The test is: could you use that to build a derivative?
&lt;p&gt;Putting a package &lt;a href="https://dep-team.pages.debian.net/deps/dep18/"&gt;in git on Salsa&lt;/a&gt; is often a good idea, but it is not sufficient. No standard branch structure is enforced for git on Salsa, nor should it be (so it can&amp;rsquo;t be automatically and reliably obtained), the tree is not in a standard form (so it can&amp;rsquo;t be automatically built), and it is not &lt;em&gt;necessarily identical&lt;/em&gt; to the source package. So &lt;code&gt;Vcs-Git&lt;/code&gt; fields, and git from Salsa, will never be sufficient to make a derivative.
&lt;p&gt;&lt;strong&gt;Debian is not publishing the source code!&lt;/strong&gt;
&lt;p&gt;The time has come for proper publication of source code by Debian to no longer be a minority sport. Every maintainer of a package whose upstream is using git (which is nearly all packages nowadays) should be basing their work on upstream git, and properly publishing that via tag2upload or dgit.
&lt;p&gt;And it&amp;rsquo;s not even difficult! The modern git-based tooling provides a far superior upload experience.
&lt;h3&gt;&lt;a name="a-common-misunderstanding"&gt;A common misunderstanding&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;dgit push is not an alternative to gbp pq or quilt. Nor is tag2upload.&lt;/strong&gt; These upload tools &lt;em&gt;complement your existing git workflow&lt;/em&gt;. They replace and improve source package building/signing and the subsequent dput. If you are using one of the usual git layouts on salsa, and your package is in good shape, you can adopt tag2upload and/or dgit push right away.
&lt;p&gt;&lt;code&gt;git-debrebase&lt;/code&gt; is distinct and &lt;em&gt;does&lt;/em&gt; provide an alternative way to manage your git packaging, do your upstream rebases, etc.
&lt;h2&gt;&lt;a name="documentation"&gt;Documentation&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;Debian&amp;rsquo;s documentation all needs to be updated, including particularly instructions for packaging, to recommend use of git-first workflows. Debian should not be importing git-using upstreams&amp;rsquo; &amp;ldquo;release tarballs&amp;rdquo; into git. (Debian outsiders who discover this practice are typically horrified.) We should use &lt;em&gt;only&lt;/em&gt; upstream git, work only in git, and properly release (and publish) in git form.
&lt;p&gt;We, the git transition team, are experts in the technology, and can provide good suggestions. But we do not have the bandwidth to also engage in the massive campaigns of education and documentation updates that are necessary &amp;mdash; especially given that (as with any programme for change) many people will be sceptical or even hostile.
&lt;p&gt;So we would greatly appreciate help with writing and outreach.
&lt;h1&gt;&lt;a name="personnel"&gt;Personnel&lt;/a&gt;&lt;/h1&gt;
&lt;p&gt;We consider ourselves the Debian git transition team.
&lt;p&gt;Currently we are:
&lt;ul&gt;&lt;li&gt;&lt;p&gt;Ian Jackson. Author and maintainer of dgit and git-debrebase. Co-creator of tag2upload. Original author of dpkg-source, and inventor in 1996 of Debian Source Packages. Alumnus of the Debian Technical Committee.

&lt;li&gt;&lt;p&gt;Sean Whitton. Co-creator of the tag2upload system; author and maintainer of git-debpush. Co-maintainer of dgit. Debian Policy co-Editor. Former Chair of the Debian Technical Committee.

&lt;/p&gt;&lt;/li&gt;&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;
&lt;p&gt;We wear the following hats related to the git transition:
&lt;ul&gt;&lt;li&gt;Maintainers of src:dgit
&lt;li&gt;&lt;a href="https://lists.debian.org/debian-devel-announce/2025/01/msg00002.html"&gt;tag2upload Delegates&lt;/a&gt;; operators of the &lt;a href="https://tag2upload.debian.org/"&gt;tag2upload service&lt;/a&gt;.
&lt;li&gt;service operators of the git depository &lt;a href="https://browse.dgit.debian.org/"&gt;*.dgit.debian.org&lt;/a&gt;.
&lt;/li&gt;&lt;/li&gt;&lt;/li&gt;&lt;/ul&gt;
&lt;p&gt;You can contact us:
&lt;ul&gt;&lt;li&gt;&lt;p&gt;By email: Ian Jackson &lt;a class="email" href="mailto:ijackson@chiark.greenend.org.uk"&gt;ijackson@chiark.greenend.org.uk&lt;/a&gt;; Sean Whitton &lt;a class="email" href="mailto:spwhitton@spwhitton.name"&gt;spwhitton@spwhitton.name&lt;/a&gt;; git-debpush@packages.d.o.

&lt;li&gt;&lt;p&gt;By filing bugs in the Debian Bug System against &lt;a href="https://bugs.debian.org/src:dgit"&gt;src:dgit&lt;/a&gt;.

&lt;li&gt;&lt;p&gt;On OFTC IRC, as &lt;code&gt;Diziet&lt;/code&gt; and &lt;code&gt;spwhitton&lt;/code&gt;.

&lt;/p&gt;&lt;/li&gt;&lt;/p&gt;&lt;/li&gt;&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;
&lt;p&gt;We do most of our heavy-duty development &lt;a href="https://salsa.debian.org/dgit-team"&gt;on Salsa&lt;/a&gt;.
&lt;h2&gt;&lt;a name="thanks"&gt;Thanks&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;Particular thanks are due to Joey Hess, who, in the now-famous design session in Vaumarcus in 2013, helped invent dgit. Since then we have had a lot of support: most recently political support to help get tag2upload deployed, but also, over the years, helpful bug reports and kind words from our users, as well as translations and code contributions.
&lt;p&gt;Many other people have contributed more generally to support for working with Debian source code in git. We particularly want to mention Guido G&amp;uuml;nther (git-buildpackage); and of course Alexander Wirt, Joerg Jaspert, Thomas Goirand and Antonio Terceiro (Salsa administrators); and before them the Alioth administrators.&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src="https://www.dreamwidth.org/tools/commentcount?user=diziet&amp;ditemid=20436" width="30" height="12" alt="comment count unavailable" style="vertical-align: middle;"/&gt; comments</content>
  </entry>
  <entry>
    <id>tag:dreamwidth.org,2009-05-21:377446:20143</id>
    <link rel="alternate" type="text/html" href="https://diziet.dreamwidth.org/20143.html"/>
    <link rel="self" type="text/xml" href="https://diziet.dreamwidth.org/data/atom/?itemid=20143"/>
    <title>tag2upload in the first month of forky</title>
    <published>2025-09-14T15:36:17Z</published>
    <updated>2025-09-14T15:36:41Z</updated>
    <category term="debian"/>
    <category term="tag2upload"/>
    <category term="git"/>
    <category term="dgit"/>
    <category term="computers"/>
    <dw:security>public</dw:security>
    <dw:reply-count>0</dw:reply-count>
    <content type="html">&lt;p&gt;tl;dr: &lt;a href="https://wiki.debian.org/tag2upload"&gt;tag2upload&lt;/a&gt; (beta) is going well so far, and is already handling around one in 13 uploads to Debian.
&lt;ul&gt;&lt;li&gt;&lt;a href="#introduction-and-some-stats"&gt;Introduction and some stats&lt;/a&gt;
&lt;li&gt;&lt;a href="#recent-uiux-improvements"&gt;Recent UI/UX improvements&lt;/a&gt;
&lt;li&gt;&lt;a href="#why-we-are-still-in-beta"&gt;Why we are still in beta&lt;/a&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="#retrying-on-salsa-side-failures"&gt;Retrying on Salsa-side failures&lt;/a&gt;
&lt;/li&gt;&lt;/ul&gt;

&lt;li&gt;&lt;a href="#other-notable-ongoing-work"&gt;Other notable ongoing work&lt;/a&gt;
&lt;li&gt;&lt;a href="#common-problems"&gt;Common problems&lt;/a&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="#reuse-of-version-numbers-and-attempts-to-re-tag"&gt;Reuse of version numbers, and attempts to re-tag&lt;/a&gt;
&lt;li&gt;&lt;a href="#discrepancies-between-git-and-orig-tarballs"&gt;Discrepancies between git and orig tarballs&lt;/a&gt;
&lt;/li&gt;&lt;/li&gt;&lt;/ul&gt;

&lt;li&gt;&lt;a href="#get-involved"&gt;Get involved&lt;/a&gt;
&lt;/li&gt;&lt;/li&gt;&lt;/li&gt;&lt;/li&gt;&lt;/li&gt;&lt;/li&gt;&lt;/ul&gt;
&lt;a name="cutid1"&gt;&lt;/a&gt;
&lt;h3&gt;&lt;a name="introduction-and-some-stats"&gt;Introduction and some stats&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;We announced tag2upload&amp;rsquo;s open beta in mid-July. That was in the middle of the freeze for trixie, so usage was fairly light until the forky floodgates opened.
&lt;p&gt;Since then the service has successfully performed &lt;strong&gt;637 uploads&lt;/strong&gt;, of which 420 were in the last 32 days. That&amp;rsquo;s an average of about 13 per day. For comparison, during the first half of September up to today there have been 2475 uploads to unstable. That&amp;rsquo;s about 176/day.
&lt;p&gt;So, tag2upload is already handling around 7.5% of uploads. This is very gratifying for a service which is advertised as still being in beta!
&lt;p&gt;Sean and I are very pleased both with the uptake, and with the way the system has been performing.
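&lt;p&gt;As a rough check, the rates quoted above can be reproduced from the raw counts. This is only a sketch: the 14-day window for unstable is an assumption, taken from the phrase &amp;ldquo;first half of September up to today&amp;rdquo; and the posting date of 14 September.&lt;/p&gt;
&lt;pre&gt;
```python
# Counts quoted in the post. The 14-day window is an assumption
# ("first half of September up to today", posted on 14 September).
t2u_uploads = 420        # tag2upload uploads in the last 32 days
t2u_days = 32
unstable_uploads = 2475  # all uploads to unstable in the window
unstable_days = 14       # assumed length of that window

t2u_per_day = t2u_uploads / t2u_days
unstable_per_day = unstable_uploads / unstable_days
share = t2u_per_day / unstable_per_day

print(f"tag2upload: {t2u_per_day:.1f} uploads/day")
print(f"unstable:   {unstable_per_day:.1f} uploads/day")
print(f"share:      {share:.1%} (about one in {1 / share:.0f})")
```
&lt;/pre&gt;
&lt;p&gt;The two rates cover different windows (32 days versus an assumed 14), so the resulting share is an estimate rather than an exact figure.&lt;/p&gt;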
&lt;h3&gt;&lt;a name="recent-uiux-improvements"&gt;Recent UI/UX improvements&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;During this open beta period we have been hard at work. We have made many improvements to the user experience.
&lt;p&gt;Current &lt;code&gt;git-debpush&lt;/code&gt; in forky, or trixie-backports, is much better at detecting various problems ahead of time.
&lt;p&gt;When uploads do fail on the service, the emailed error reports are now more informative. For example, anomalies involving orig tarballs, which by definition can&amp;rsquo;t be detected locally (since one point of tag2upload is not to have tarballs locally), now generally result in failure reports containing a diffstat, and instructions for a local repro.
&lt;h3&gt;&lt;a name="why-we-are-still-in-beta"&gt;Why we are still in beta&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;There are a few outstanding work items that we currently want to complete before we declare the end of the beta.
&lt;h4&gt;&lt;a name="retrying-on-salsa-side-failures"&gt;Retrying on Salsa-side failures&lt;/a&gt;&lt;/h4&gt;
&lt;p&gt;The biggest of these is that the service should be able to retry when Salsa fails. Sadly, Salsa isn&amp;rsquo;t wholly reliable, and right now if it breaks when the service is trying to handle your tag, your upload can fail.
&lt;p&gt;We think most of these failures could be avoided. Implementing retries is a fairly substantial task, but doesn&amp;rsquo;t pose any fundamental difficulties. We&amp;rsquo;re working on this right now.
&lt;h3&gt;&lt;a name="other-notable-ongoing-work"&gt;Other notable ongoing work&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;We want to support pristine-tar, so that pristine-tar users can do a new upstream release. Andrea Pappacoda is working on that with us. See &lt;a href="https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1106071"&gt;#1106071&lt;/a&gt;. (Note that we would generally &lt;strong&gt;recommend against use of pristine-tar&lt;/strong&gt; within Debian. But we want to support it.)
&lt;p&gt;We have been having conversations with &lt;a href="https://salsa.debian.org/freexian-team/debusine"&gt;Debusine&lt;/a&gt; folks about what integration between tag2upload and Debusine would look like. We&amp;rsquo;re &lt;a href="https://salsa.debian.org/freexian-team/debusine/-/issues/815#note_651533"&gt;making some progress&lt;/a&gt; there, but a lot is still up in the air.
&lt;p&gt;&lt;a href="https://salsa.debian.org/salsa-ci-team/pipeline/-/issues/467#note_642152"&gt;We are considering&lt;/a&gt; how best to provide tag2upload pre-checks as part of Salsa CI. There are several problems detected by the tag2upload service that could be detected by Salsa CI too, but which can&amp;rsquo;t be detected by &lt;code&gt;git-debpush&lt;/code&gt;.
&lt;h3&gt;&lt;a name="common-problems"&gt;Common problems&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;We&amp;rsquo;ve been monitoring the service and until very recently we have investigated every service-side failure, to understand the root causes. This has given us insight into the kinds of things our users want, and the kinds of packaging and git practices that are common. We&amp;rsquo;ve been able to improve the system&amp;rsquo;s handling of various anomalies and also improved the documentation.
&lt;p&gt;Right now our failure rate is still rather high, at around 7%. Partly this is because people are trying out the system on packages that haven&amp;rsquo;t ever seen git tooling with such a level of rigour.
&lt;p&gt;There are two classes of problem that are responsible for the vast majority of the failures that we&amp;rsquo;re still seeing:
&lt;h4&gt;&lt;a name="reuse-of-version-numbers-and-attempts-to-re-tag"&gt;Reuse of version numbers, and attempts to re-tag&lt;/a&gt;&lt;/h4&gt;
&lt;p&gt;tag2upload, like git (and like &lt;code&gt;dgit&lt;/code&gt;), hates it when you reuse a version number, or try to pretend that a (perhaps busted) release never happened.
&lt;p&gt;git tags aren&amp;rsquo;t namespaced, and tend to spread about promiscuously. So replacing a signed git tag, with a different tag of the same name, is a bad idea. More generally, reusing the same version number for a different (signed!) package is poor practice. Likewise, it&amp;rsquo;s usually a bad idea to remove changelog entries for versions which were actually released, just because they were later deemed improper.
&lt;p&gt;We understand that many Debian contributors have gotten used to this kind of thing. Indeed, tools like &lt;code&gt;dcut&lt;/code&gt; encourage it. It does allow you to make things neat-looking, even if you&amp;rsquo;ve made mistakes - but really it does so by &lt;em&gt;covering up&lt;/em&gt; those mistakes!
&lt;p&gt;The bottom line is that tag2upload can&amp;rsquo;t support such history-rewriting. If you discover a mistake after you&amp;rsquo;ve signed the tag, please just &lt;strong&gt;burn the version number and add a new changelog stanza&lt;/strong&gt;.
&lt;p&gt;One bonus of tag2upload&amp;rsquo;s approach is that it will discover if you are accidentally overwriting an NMU, and report that as an error.
&lt;h4&gt;&lt;a name="discrepancies-between-git-and-orig-tarballs"&gt;Discrepancies between git and orig tarballs&lt;/a&gt;&lt;/h4&gt;
&lt;p&gt;tag2upload promises that the source package that it generates corresponds precisely to the git tree you tag and sign.
&lt;p&gt;Orig tarballs make this complicated. They aren&amp;rsquo;t present on your laptop when you &lt;code&gt;git-debpush&lt;/code&gt;. When you&amp;rsquo;re not uploading a new upstream version, the tag2upload service reuses existing orig tarballs from the archive. If your git and the archive&amp;rsquo;s orig don&amp;rsquo;t agree, the tag2upload service will report an error, rather than upload a package with contents that differ from your git tag.
&lt;p&gt;With the most common Debian workflows, everything is fine:
&lt;p&gt;If you base everything on upstream git, and make your orig tarballs with &lt;code&gt;git archive&lt;/code&gt; (or &lt;code&gt;git deborig&lt;/code&gt;), your orig tarballs are the same as the git, by construction. &lt;strong&gt;We recommend usually ignoring upstream tarballs&lt;/strong&gt;: most upstreams work in git, and their tarballs can contain weirdness that we don&amp;rsquo;t want. (At worst, the tarball can contain an attack that isn&amp;rsquo;t visible in git, as with &lt;code&gt;xz&lt;/code&gt;!)
&lt;p&gt;Alternatively, if you use &lt;code&gt;gbp import-orig&lt;/code&gt;, the differences (including an attack like Jia Tan&amp;rsquo;s) are &lt;em&gt;imported into&lt;/em&gt; git for you. Then, once again, your git and the orig tarball will correspond.
&lt;p&gt;But there are other workflows where this correspondence may not hold. Those workflows are hazardous, because the thing you&amp;rsquo;re probably working with locally for your routine development is the git view. Then, when you upload, your work is transplanted onto the orig tarball, which might be quite different - so what you upload isn&amp;rsquo;t what you&amp;rsquo;ve been working on!
&lt;p&gt;This situation is detected by tag2upload, precisely because tag2upload checks that it&amp;rsquo;s keeping its promise: the source package is identical to the git view. (&lt;code&gt;dgit push&lt;/code&gt; makes the same promise.)
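&lt;p&gt;To illustrate what &amp;ldquo;discrepancy&amp;rdquo; means here, the following is a minimal sketch of comparing two unpacked trees file by file. It is not how the tag2upload service performs its check. The function names are hypothetical, and it assumes you have already exported the git tree and unpacked the orig tarball into two directories, and that &lt;code&gt;debian/&lt;/code&gt; should be ignored because it is not part of the orig.&lt;/p&gt;
&lt;pre&gt;
```python
import hashlib
from pathlib import Path


def tree_digests(root, skip=("debian", ".git")):
    """Map each file's relative path to the SHA-256 of its contents."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        rel = path.relative_to(root)
        # Ignore packaging and git metadata, and anything not a file.
        if rel.parts[0] in skip or not path.is_file():
            continue
        digests[rel.as_posix()] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def tree_discrepancies(git_export, orig_unpacked):
    """Return sorted paths that are missing from one tree or differ."""
    a = tree_digests(git_export)
    b = tree_digests(orig_unpacked)
    return sorted(p for p in a.keys() | b.keys() if a.get(p) != b.get(p))
```
&lt;/pre&gt;
&lt;p&gt;An empty result means the two trees correspond. Any path it lists, such as a generated file present only in the tarball, is the kind of difference that would make the uploaded source differ from the git view.&lt;/p&gt;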
&lt;h3&gt;&lt;a name="get-involved"&gt;Get involved&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Of course the easiest way to get involved is to &lt;a href="https://wiki.debian.org/tag2upload"&gt;start using tag2upload&lt;/a&gt;.
&lt;p&gt;We would love to have more contributors. There are some easy tasks to get started with, in &lt;a href="https://bugs.debian.org/cgi-bin/pkgreport.cgi?src=dgit;tag=newcomer"&gt;bugs we&amp;rsquo;ve tagged &amp;ldquo;newcomer&amp;rdquo;&lt;/a&gt; &amp;mdash; mostly UX improvements such as detecting certain problems earlier, in &lt;code&gt;git-debpush&lt;/code&gt;.
&lt;p&gt;More substantially, we are looking for help with &lt;code&gt;sbuild&lt;/code&gt;: we&amp;rsquo;d like it to be able to work directly from git, rather than needing to build source packages: &lt;a href="https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=868527"&gt;#868527&lt;/a&gt;.&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;/p&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src="https://www.dreamwidth.org/tools/commentcount?user=diziet&amp;ditemid=20143" width="30" height="12" alt="comment count unavailable" style="vertical-align: middle;"/&gt; comments</content>
  </entry>
  <entry>
    <id>tag:dreamwidth.org,2009-05-21:377446:14666</id>
    <link rel="alternate" type="text/html" href="https://diziet.dreamwidth.org/14666.html"/>
    <link rel="self" type="text/xml" href="https://diziet.dreamwidth.org/data/atom/?itemid=14666"/>
    <title>Never use git submodules</title>
    <published>2023-03-02T19:48:20Z</published>
    <updated>2023-03-02T19:48:20Z</updated>
    <category term="git"/>
    <category term="computers"/>
    <dw:security>public</dw:security>
    <dw:reply-count>3</dw:reply-count>
    <content type="html">&lt;h2&gt;tl;dr&lt;/h2&gt;
&lt;p&gt;git submodules are &lt;em&gt;always the wrong solution&lt;/em&gt;. Yes, even to the problem they were specifically invented to solve.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#what-is-wrong-with-git-submodules"&gt;What is wrong with git submodules&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#better-alternatives-to-git-submodules"&gt;Better alternatives to git submodules&lt;/a&gt;&lt;ul&gt;
&lt;li&gt;&lt;a href="#use-git-subtree"&gt;Use git subtree&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#just-have-a-monorepo"&gt;Just have a monorepo&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#use-a-package-management-system-and-explicit-dependencies"&gt;Use a package management system, and explicit dependencies&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#use-the-multiple-repository-tool-mr"&gt;Use the multiple repository tool &lt;code&gt;mr&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#have-your-build-expect-to-find-the-dependency-in-..-its-parent-dir"&gt;Have your build expect to find the dependency in &lt;code&gt;..&lt;/code&gt;, its parent dir&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#provide-an-ad-hoc-in-tree-script-to-download-the-dependency"&gt;Provide an ad-hoc in-tree script to download the dependency&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;a name="cutid1"&gt;&lt;/a&gt;
&lt;h2&gt;What is wrong with git submodules&lt;/h2&gt;
&lt;p&gt;There are two principal sets of reasons why they are terrible:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Fundamentally wrong design. They break the git data model in multiple ways. Critical ways include:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;A git object in your repository is no longer necessarily resolvable/interpretable to meaningful data. (Shallow clones have the same issue but only with respect to history. git submodules do this for the contents of the tree.)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;git submodules violate the usual rule that all URLs, hostnames, and so on, used by git, are provided by the git configuration and the user, rather than appearing in-tree.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;git submodules introduce completely new states your tree can be in, many of them strange or undesirable.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Wrong behaviour in detail. git’s behaviour with submodules is often buggy or bizarre. Some of these problems are implied by the design, but many of them are additional unforced errors. Some of the defects occur even if you don’t &lt;code&gt;git submodule init&lt;/code&gt;, so affect &lt;em&gt;all&lt;/em&gt; programs and users which interact with your tree.&lt;/p&gt;
&lt;p&gt;Just a few examples of lossage with submodules:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;git checkout no longer reliably switches branches&lt;/li&gt;
&lt;li&gt;editing files and trying to commit them no longer reliably works&lt;/li&gt;
&lt;li&gt;locally pulling a new version from main no longer reliably works&lt;/li&gt;
&lt;li&gt;git ls-files can disagree with git log and git cat-file&lt;/li&gt;
&lt;li&gt;URLs from .gitmodules: they can be malicious; they can end up cached in individual trees’ (individual users’) .git/config; etc.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Generally, normal git operations like git checkout and git pull can leave the submodule in a weird state where you have to run one of the git submodule commands to fix it up. Often the easiest way (especially for a non-expert) to get back to a normal state is to throw the whole tree away and re-clone it.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Ultimately, this means that the author of a program which works with git has two options:&lt;/p&gt;
&lt;ol type="1"&gt;
&lt;li&gt;&lt;p&gt;Don’t support submodules. Tell users of your program who file bugs involving submodules that they’re not supported.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Do an enormous amount of extra work: At every point you interact with git, experiment to see what bizarre behaviour submodules exhibit, and write code to deal with all the possibilities.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;As a result, a substantial subset of git tooling is broken in the presence of submodules. This is especially true of local automation and tooling, which is otherwise an effective way of improving your processes. But, of course this also applies to git itself! Which is one of the causes of the bugs that git itself has when working with submodules.&lt;/p&gt;
&lt;h2&gt;Better alternatives to git submodules&lt;/h2&gt;
&lt;p&gt;In my opinion git submodule is &lt;em&gt;never&lt;/em&gt; the right answer. Often, git submodule is the &lt;em&gt;worst&lt;/em&gt; answer and &lt;em&gt;any&lt;/em&gt; of the following would be better.&lt;/p&gt;
&lt;h3&gt;Use git subtree&lt;/h3&gt;
&lt;p&gt;&lt;a href="https://manpages.debian.org/stable/git-man/git-subtree.1.en.html"&gt;git subtree&lt;/a&gt; solves many of the same problems as git submodule, but it does not violate the git data model.&lt;/p&gt;
&lt;p&gt;Use this when:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;You want to track and use, in-tree, a separate project which ought to have its own identity.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The separate project is of reasonable size (compared to your own).&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;With git subtree, people and programs that do not need to specifically interact with the upstream for the subtree, do not need to know that it even &lt;em&gt;is&lt;/em&gt; a subtree. They can make and switch branches, commit, and so on, as they like.&lt;/p&gt;
&lt;p&gt;git subtree can automatically separate out changes made in the downstream, for application to (or submission to) the upstream branch.&lt;/p&gt;
&lt;p&gt;I have used git subtree and found it capable and convenient, and pleasingly straightforward.&lt;/p&gt;
&lt;h3&gt;Just have a monorepo&lt;/h3&gt;
&lt;p&gt;If you are the upstream for all the pieces, it is often more convenient to merge the git trees into a single git tree with a single history.&lt;/p&gt;
&lt;p&gt;Use this when:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;The maintenance of all the pieces is &lt;em&gt;organisationally&lt;/em&gt; and &lt;em&gt;politically&lt;/em&gt; cohesive enough that you can share a git history.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The whole monorepo would be of reasonable size.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Any long-running branches you need to make are for release channels, or similar, not for having separate versions of the internal dependencies for the different pieces in the monorepo.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;Use a package management system, and explicit dependencies&lt;/h3&gt;
&lt;p&gt;Instead of subsuming the dependency’s tree into your own, give the dependency a proper API and reuse it via a package management system. (If necessary, maintain a proper downstream fork of the dependency.)&lt;/p&gt;
&lt;p&gt;The package manager might be:&lt;/p&gt;
&lt;ol type="1"&gt;
&lt;li&gt;a distro-style package management system such as &lt;code&gt;apt&lt;/code&gt;+&lt;code&gt;dpkg&lt;/code&gt;+&lt;a href="https://manpages.debian.org/stable/sbuild/sbuild.1.en.html"&gt;&lt;code&gt;sbuild&lt;/code&gt;&lt;/a&gt; (or a proprietary/private dependency-managing build system); or&lt;/li&gt;
&lt;li&gt;a language specific package manager (eg &lt;code&gt;cargo&lt;/code&gt;).&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Use this when:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;You are already using, or familiar with, a suitable package manager,&lt;/li&gt;
&lt;li&gt;The API provided by the dependency can be reasonably represented in that package manager (even if unstably).&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;Use the multiple repository tool &lt;code&gt;mr&lt;/code&gt;&lt;/h3&gt;
&lt;p&gt;&lt;a href="https://manpages.debian.org/stable/myrepos/mr.1.en.html"&gt;&lt;code&gt;mr(1)&lt;/code&gt;&lt;/a&gt; is a tool which lets you conveniently manage a possibly large number of trees, usually as sibling directories.&lt;/p&gt;
&lt;p&gt;I haven’t used this myself but it looks capable and straightforward. As I understand it, you’d usually use this in combination with the &lt;code&gt;..&lt;/code&gt;-based dependency expectation I describe below.&lt;/p&gt;
&lt;p&gt;It seems like it would be good when your project has a fair number of “foreign” dependencies.&lt;/p&gt;
&lt;h3&gt;Have your build expect to find the dependency in &lt;code&gt;..&lt;/code&gt;, its parent dir&lt;/h3&gt;
&lt;p&gt;This is a very lightweight solution. Just have the files in your tree refer to the dependencies with &lt;code&gt;../dependency-name/&lt;/code&gt;. Expect users (and programs) to manually clone and update the right dependency version, alongside your project.&lt;/p&gt;
&lt;p&gt;Consider this when:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Your project is at an early stage and you want to get going quickly and worry about this build system stuff later.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The dependency is disabled by default, and almost never needed.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Every program or human that wants to run a build that needs the dependency will need to know to clone the dependency, and keep it up to date. This will be a nuisance, and if you’re doing CI it will mean some custom CI scripting. But this is all probably still better than git submodules. At least it will be completely obvious to everyone what’s going on, how to make changes to the dependency, and so on.&lt;/p&gt;
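&lt;p&gt;One way to keep this obvious to everyone is for the build to look for the sibling directory and fail with a clear message when it is missing. The following is a minimal sketch in Python. The function name and the error wording are hypothetical, and a real build would more likely do this in its own build system.&lt;/p&gt;
&lt;pre&gt;
```python
from pathlib import Path


def find_sibling_dependency(project_root, name):
    """Return the path of ../name relative to the project, or raise."""
    candidate = Path(project_root).resolve().parent / name
    if not candidate.is_dir():
        raise FileNotFoundError(
            f"expected dependency {name!r} at {candidate}; "
            "clone it alongside this project"
        )
    return candidate
```
&lt;/pre&gt;
&lt;p&gt;This only checks that the directory exists. It does not check that the clone is at the right version, which remains the responsibility of whoever runs the build.&lt;/p&gt;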
&lt;h3&gt;Provide an ad-hoc in-tree script to download the dependency&lt;/h3&gt;
&lt;p&gt;As a last resort, you can embed the URL to find your dependency, and the instructions for downloading it, in your top-level package’s build system. This is clumsy and awkward, but, astonishingly, it is less painful than git submodules.&lt;/p&gt;
&lt;p&gt;Use this when:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Most people using/building your software won’t need the dependency at all.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;In particular, most people won’t need to edit the dependency.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;None of the other options are suitable.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Usually the downstream build runes should git clone the dependency, and the downstream tree should name the precise commitid needed.&lt;/p&gt;
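&lt;p&gt;As a sketch of what such runes might look like, here is a small Python helper that clones a dependency and checks out one pinned commit. The URL, the commit id and the function names are placeholders for illustration only. The same two git commands could equally be written directly in a Makefile or shell script.&lt;/p&gt;
&lt;pre&gt;
```python
import subprocess

# Hypothetical values for illustration; substitute your own.
DEP_URL = "https://example.org/dependency.git"
DEP_COMMIT = "0123456789abcdef0123456789abcdef01234567"


def fetch_commands(url, commit, dest):
    """Build the git commands that clone url and check out commit."""
    return [
        ["git", "clone", "--no-checkout", url, dest],
        ["git", "-C", dest, "checkout", "--detach", commit],
    ]


def fetch_dependency(url=DEP_URL, commit=DEP_COMMIT, dest="dependency"):
    """Run the commands in order, stopping at the first failure."""
    for cmd in fetch_commands(url, commit, dest):
        subprocess.run(cmd, check=True)
```
&lt;/pre&gt;
&lt;p&gt;Keeping the commit id in the downstream tree means that changing the dependency version is an ordinary, reviewable commit.&lt;/p&gt;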
&lt;p&gt;Try to avoid this situation. It’s not a good place to be. But:&lt;/p&gt;
&lt;h4&gt;Yes, really, git submodule is worse than ad-hoc Makefile runes&lt;/h4&gt;
&lt;p&gt;The ad-hoc shell script route feels very hacky. But it has some important advantages over git submodule. In particular, unlike with git submodule, this approach (like most of the others I suggest) means that:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;All tooling that expects to clone your repository, make changes, do builds, track changes, etc., will work correctly.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;You are in precise control of when/whether the download occurs: ie, you can arrange to download the dependency precisely when it’s needed.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;You are in precise control of your version management and checking of the dependency: your script controls what version of the dependency to use, and whether that should be “pinned” or dynamically updated.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I’m not advocating ad-hoc runes over git submodules because I like ad-hoc runes or think they’re a good idea. It’s just that git submodule is really so very very bad.&lt;/p&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src="https://www.dreamwidth.org/tools/commentcount?user=diziet&amp;ditemid=14666" width="30" height="12" alt="comment count unavailable" style="vertical-align: middle;"/&gt; comments</content>
  </entry>
  <entry>
    <id>tag:dreamwidth.org,2009-05-21:377446:515</id>
    <link rel="alternate" type="text/html" href="https://diziet.dreamwidth.org/515.html"/>
    <link rel="self" type="text/xml" href="https://diziet.dreamwidth.org/data/atom/?itemid=515"/>
    <title>git signed commits are a bad idea</title>
    <published>2018-01-29T17:18:46Z</published>
    <updated>2018-01-29T17:42:41Z</updated>
    <category term="git"/>
    <category term="computers"/>
    <dw:security>public</dw:security>
    <dw:reply-count>1</dw:reply-count>
    <content type="html">&lt;h2&gt;Summary&lt;/h2&gt;&lt;br /&gt;Occasionally I hear that people are using git signed commits.  IMO they should probably be doing something else.  I am not aware of any use case for which signed git commits are the right answer.  All uses should be replaced by signed tags, or signed pushes.&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;Analysis&lt;/h2&gt;&lt;br /&gt;The root cause of the problem is that a commit does not say which branch it is destined for.  The same commit (tree, message, and ancestry) might be a debugging patch - not intended for deployment, even harmful; or it might be unfinished and buggy; or it might be ready for deployment.&lt;br /&gt;&lt;br /&gt;In practical terms, if your system is set up to require every commit to be signed, the expected development model is that developers will configure their git to sign every commit.  (The alternative is to batch-rewrite branches later, to add the signatures, which is an obvious subversion of the intention that signed commits were to provide traceability.)&lt;br /&gt;&lt;br /&gt;If you make your developers sign every commit as they go, they will sign lots and lots of junk.  Test commits, unfinished branches, local experiments, etc. etc.  Whatever way your signatures are supposed to be part of your security assurance, is subvertible by anyone who can find copies of these signed junk branches - and git of course encourages promiscuous sharing of branches, including works in progress.  (Usually that promiscuous sharing is good.)&lt;br /&gt;&lt;br /&gt;Furthermore, developers will have to have their signing key continuously available; so to the extent that that signing key is important for your system's integrity, it is weakened.&lt;br /&gt;&lt;br /&gt;What is really required is that the person who approves a push to master should make a signature declaring what they intend.  
It is at that point that a human tells their computer what the destination of a branch is supposed to be - before that, the branch is an unfinished work-in-progress.&lt;br /&gt;&lt;br /&gt;The same is true of other kinds of publication operations where it looks like signed commits might be useful.  For example, submission of a pull request for review:  you want the signature to say what the person making it intends: ie, that this is a pull request, which branch it is targeted for, etc.&lt;br /&gt;&lt;br /&gt;There are two git features that can be used to implement what is really needed:&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;Signed pushes&lt;/h2&gt;&lt;br /&gt;git supports attaching a gpg signature to &amp;lsquo;git push&amp;rsquo;.  This would be precisely right for many applications.&lt;br /&gt;&lt;br /&gt;The signature covers precisely what it ought to: it describes the set of refs that should be updated, including their previous values.  So (if you have a workflow involving reviews) the signature shows whether the signer intended &amp;ldquo;this should be reviewed as a submission to master&amp;rdquo; (a request to the reviewer) or &amp;ldquo;I have reviewed and approved this and it should be pushed to master immediately&amp;rdquo; (an instruction to the repository).  
You can tell the difference between &amp;ldquo;this is intended for master&amp;rdquo; and &amp;ldquo;this is intended as a stopgap to fix user issue in ticket #NNNN&amp;rdquo;.&lt;br /&gt;&lt;br /&gt;Unfortunately the git server side is afflicted by a &lt;a href="https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=852647"&gt;number&lt;/a&gt; &lt;a href="https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=852684"&gt;of&lt;/a&gt; &lt;a href="https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=852688"&gt;annoyances&lt;/a&gt;, and a &lt;a href="https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=852695"&gt;lack of good docs&lt;/a&gt;, which make verification and publication of the push history awkward.&lt;br /&gt;&lt;br /&gt;By default the git server in the main git project does not record push signatures anywhere.  Obviously you would need to store them.  There are some &lt;a href="https://git.eclipse.org/r/#/q/message:cert"&gt;semi&lt;/a&gt;[1] &lt;a href="https://github.com/sitaramc/gitolite/blob/cf062b8bb6b21a52f7c5002d33fbc950762c1aa7/contrib/hooks/repo-specific/save-push-signatures"&gt;standard&lt;/a&gt; approaches to this problem, implemented by some other git server projects.&lt;br /&gt;&lt;br /&gt;[1] Javascript needed to even see this.  Sorry.&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;Signed tags&lt;/h2&gt;&lt;br /&gt;An obvious alternative feature is signed tags.  A tag has a name; furthermore, it has an associated message.  If necessary the message can carry metadata about the author's intent.  (Commits can carry such metadata too but it is not desirable to rewrite a commit just to change its intended destination.)&lt;br /&gt;&lt;br /&gt;So it is possible to use signed tags to do the job of signed pushes.&lt;br /&gt;&lt;br /&gt;In this model, every push to every controlled branch is associated with a signed tag.  The signed tag's name identifies the branch it is intended for.  
The branch update should be accepted if its tag refers to a commit which is descended from the current HEAD.  Later verification involves observing that the commit objects referred to by the signed tags form the expected DAG.&lt;br /&gt;&lt;br /&gt;(The &lt;a href="https://manpages.debian.org/stretch-backports/dgit/dgit.1.en.html"&gt;dgit&lt;/a&gt; git repository I run for Debian verifies pushes by looking for particular signed tags.)&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;Signing only the merges to master&lt;/h2&gt;&lt;br /&gt;There is one caveat to my assertion that signed commits are useless.  A signed merge commit can sometimes do the job of a signed push.&lt;br /&gt;&lt;br /&gt;In this use model, mainline only advances by merges, and the only signed commits are the merges which bless other branches into mainline.  The rest of the merged branch consists of unsigned commits.&lt;br /&gt;&lt;br /&gt;If your tools support only signed commits, rather than signed pushes (or don't record push signatures), and can't be made to do useful verification of signed tags, maybe you could get them to expect (only) the mainline merges to be signed.&lt;br /&gt;&lt;br /&gt;There would be some subtleties here - for example, because merges can be nontrivial.  I wouldn't recommend this approach without thinking harder about it.&lt;br /&gt;&lt;br /&gt;&lt;img src="https://www.dreamwidth.org/tools/commentcount?user=diziet&amp;ditemid=515" width="30" height="12" alt="comment count unavailable" style="vertical-align: middle;"/&gt; comments</content>
  </entry>
</feed>
