<?xml version="1.0" encoding="utf-8"?>
<feed xml:lang="en-US" xmlns="http://www.w3.org/2005/Atom">
  <title>Marcelo Fernandes</title>
  <link rel="alternate" type="text/html" href="https://marcelofern.com"/>
  <link rel="self" type="application/atom+xml" href="https://marcelofern.com/feed"/>
  <updated>2003-12-13T18:30:02Z</updated>
  <author>
    <name>Marcelo Fernandes</name>
    <email>marceelofernandes@gmail.com</email>
    <uri>https://marcelofern.com</uri>
  </author>
  <id>tag:marcelofern.com,2024:/feed</id>
<entry><title>Profiling a Django Migration in Postgres</title><link href="https://www.marcelofern.com/posts/postgres/profiling_a_django_migration_in_postgres/index.html"/><id>tag:marcelofern.com,2025-02-17:/postgres/profiling_a_django_migration_in_postgres/index.html</id><content type="html">&lt;h1&gt;Profiling a Django Migration in Postgres&lt;/h1&gt;
&lt;pre&gt;&lt;code&gt;Created at: 2025-02-17
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In this post I want to start from the end. I want to look into the SQL for a
particular schema change, and then verify whether a Django migration that
produces this change is safe to run in production or not.&lt;/p&gt;
&lt;p&gt;Let&#x27;s start with a question:
&lt;em&gt;Is the following schema change safe to run in a production database?&lt;/em&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;ALTER TABLE foo ADD COLUMN bar int NOT NULL DEFAULT 12345;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In this hypothetical scenario, &lt;code&gt;foo&lt;/code&gt; is:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Fairly large (over 100GB).&lt;/li&gt;
&lt;li&gt;Used in anger in production.&lt;/li&gt;
&lt;li&gt;Running on a supported Postgres version (&amp;gt; v12).&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Without answering the question yet, I want you to consider this other
statement:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;ALTER TABLE foo ADD COLUMN buzz int NOT NULL DEFAULT (random() * 10000)::int;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;So, have you figured out whether either (or both) of those are safe to run?&lt;/p&gt;
&lt;p&gt;If not, you might want to start thinking about what Postgres would have to do
in order to have a &lt;code&gt;NOT NULL&lt;/code&gt; column with a &lt;code&gt;DEFAULT&lt;/code&gt; value.&lt;/p&gt;
&lt;p&gt;Would it need to scan the table and store those values in existing rows? What
if the new rows didn&#x27;t fit in the page? Is there a way to do it so that
Postgres doesn&#x27;t need to scan the table?&lt;/p&gt;
&lt;p&gt;One of the worst things that can happen when you perform a schema change is for
it to end up rewriting the table. Rewriting takes time, and while the table is
being rewritten, the DDL statement will be holding an access exclusive lock,
not permitting any other sessions and transactions to read or write to the
table.&lt;/p&gt;
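&lt;p&gt;To see that lock for yourself, here is a minimal sketch. It assumes a table
named &lt;code&gt;foo&lt;/code&gt; already exists and that you have two &lt;code&gt;psql&lt;/code&gt; sessions
open; the column name &lt;code&gt;tmp_col&lt;/code&gt; is made up for the illustration.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- Session 1: run the DDL inside a transaction and leave it open.
BEGIN;
ALTER TABLE foo ADD COLUMN tmp_col int;

-- Session 2: list the locks currently held on foo.
SELECT pid, mode, granted
FROM pg_locks
WHERE relation = &#x27;foo&#x27;::regclass;
-- Expect a row with mode = AccessExclusiveLock owned by session 1.

-- Session 1: release the lock.
ROLLBACK;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;While session 1 keeps the transaction open, even a plain
&lt;code&gt;SELECT * FROM foo&lt;/code&gt; from another session will wait.&lt;/p&gt;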
&lt;h2&gt;Supposition: The table is rewritten&lt;/h2&gt;
&lt;p&gt;So starting with the first statement, let&#x27;s investigate whether it rewrites the
table or not. We will first need to get the &lt;code&gt;foo&lt;/code&gt; table up, and populate it.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- Create the table
DROP TABLE IF EXISTS foo;
CREATE TABLE foo (id SERIAL PRIMARY KEY);

-- Insert 100_000 rows.
INSERT INTO foo (id) SELECT generate_series(1, 100000);
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Next, we want to know what Postgres is doing internally. For that, we&#x27;ll need
to profile what happens when the &lt;code&gt;ALTER TABLE&lt;/code&gt; command is running.&lt;/p&gt;
&lt;p&gt;Note: As I am writing this post on a Mac, I will use &amp;quot;Instruments&amp;quot; to profile
Postgres, but if you are on a Linux machine you can use &lt;code&gt;perf&lt;/code&gt; instead. I wrote
a guide
&lt;a href=&quot;https://marcelofern.com/notes/databases/postgres/internals/profiling_postgres_on_linux.html&quot;&gt;here&lt;/a&gt;
for the Linux users.&lt;/p&gt;
&lt;p&gt;The first step is to grab the process id of the Postgres backend serving the
&lt;code&gt;psql&lt;/code&gt; session we are going to profile:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;SELECT pg_backend_pid();
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Then, open the &amp;quot;Time Profiler&amp;quot; tool on Instruments.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;img_time_profiler.png&quot; alt=&quot;img_time_profiler.png&quot;&gt;&lt;/p&gt;
&lt;p&gt;And find the Postgres process. In terms of configuration I mostly use the
defaults. I only change the frequency to &amp;quot;High&amp;quot;, and recording mode to
&amp;quot;Deferred&amp;quot;:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;img_time_profiler_config.png&quot; alt=&quot;img_time_profiler_config.png&quot;&gt;&lt;/p&gt;
&lt;p&gt;Now we hit &lt;code&gt;RECORD&lt;/code&gt;, and perform these statements on psql:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;BEGIN;
ALTER TABLE foo ADD COLUMN buzz int NOT NULL DEFAULT (random() * 10000)::int;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And then we hit &lt;code&gt;STOP&lt;/code&gt;. The profiler result looks something like this:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;img_default_random_profile.png&quot; alt=&quot;img_default_random_profile.png&quot;&gt;&lt;/p&gt;
&lt;p&gt;There is a suspicious call to &lt;code&gt;ATRewriteTable&lt;/code&gt;... This is not good!&lt;/p&gt;
&lt;p&gt;Let&#x27;s see what the other alter table with a constant default does. But first,
let&#x27;s roll back that transaction.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;ROLLBACK;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And now let&#x27;s run our Time Profiler and then execute the command:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;BEGIN;
ALTER TABLE foo ADD COLUMN bar int NOT NULL DEFAULT 12345;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&quot;img_default_constant_profile.png&quot; alt=&quot;img_default_constant_profile.png&quot;&gt;&lt;/p&gt;
&lt;p&gt;Wait a minute... Is this calling &lt;code&gt;ATRewriteTables&lt;/code&gt;?&lt;/p&gt;
&lt;p&gt;Yes! But this is a false positive... Calling this function doesn&#x27;t mean that it
is actually rewriting the table. Perhaps a better name for that function
would be &lt;code&gt;ATMaybeRewriteTables&lt;/code&gt;? ...&lt;/p&gt;
&lt;p&gt;In any case, if &lt;code&gt;ATRewriteTables&lt;/code&gt; is going to actually do anything, it will
call the &lt;code&gt;ATRewriteTable&lt;/code&gt; (note the singular) function, where the magic
happens.&lt;/p&gt;
&lt;p&gt;Scrolling down &lt;code&gt;ATRewriteTable&lt;/code&gt;, I also see this pattern:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-c&quot;&gt;    if (newrel || needscan)
    {
        if (newrel)
            ereport(DEBUG1,
                    (errmsg_internal(&amp;quot;rewriting table \&amp;quot;%s\&amp;quot;&amp;quot;,
                                     RelationGetRelationName(oldrel))));
        else
            ereport(DEBUG1,
                    (errmsg_internal(&amp;quot;verifying table \&amp;quot;%s\&amp;quot;&amp;quot;,
                                     RelationGetRelationName(oldrel))));
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;So this means that Postgres emits a &lt;code&gt;DEBUG1&lt;/code&gt; message when it&#x27;s rewriting or
verifying a table. You can have these messages sent to your session with:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;SET client_min_messages=debug1;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;So if we run the SQL statements again, we&#x27;ll see that log message showing up in
the &lt;code&gt;psql&lt;/code&gt; shell:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;BEGIN;

ALTER TABLE foo ADD COLUMN buzz int NOT NULL DEFAULT (random() * 10000)::int;
-- DEBUG:  rewriting table &amp;quot;foo&amp;quot;

-- This one doesn&#x27;t print anything, as the table is not rewritten.
ALTER TABLE foo ADD COLUMN bar int NOT NULL DEFAULT 12345;

ROLLBACK;
&lt;/code&gt;&lt;/pre&gt;
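&lt;p&gt;Another way to cross-check, sketched here under the assumption that the
&lt;code&gt;foo&lt;/code&gt; table from above exists: a rewrite gives the table a new file on
disk, so &lt;code&gt;pg_relation_filenode&lt;/code&gt; changes, while a constant default is
recorded in the catalogue (&lt;code&gt;pg_attribute.attmissingval&lt;/code&gt;) instead.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;SELECT pg_relation_filenode(&#x27;foo&#x27;);  -- note this value

BEGIN;
ALTER TABLE foo ADD COLUMN buzz int NOT NULL DEFAULT (random() * 10000)::int;
SELECT pg_relation_filenode(&#x27;foo&#x27;);  -- different value: the table was rewritten
ROLLBACK;

BEGIN;
ALTER TABLE foo ADD COLUMN bar int NOT NULL DEFAULT 12345;
SELECT pg_relation_filenode(&#x27;foo&#x27;);  -- same value as at the start: no rewrite

-- The default for pre-existing rows lives in the catalogue.
SELECT attname, atthasmissing, attmissingval
FROM pg_attribute
WHERE attrelid = &#x27;foo&#x27;::regclass AND attname = &#x27;bar&#x27;;
ROLLBACK;
&lt;/code&gt;&lt;/pre&gt;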
&lt;h2&gt;The Django Equivalent&lt;/h2&gt;
&lt;p&gt;Say we have the following &amp;quot;dumb&amp;quot; model:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;class Foo(models.Model):
    pass
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Let&#x27;s add a new integer field with a default:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;class Foo(models.Model):
    bar = models.IntegerField(null=False, default=10)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Django will create the following migration automatically:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-py&quot;&gt;# Generated by Django 5.1.6 on 2025-02-17 05:54

from django.db import migrations, models


class Migration(migrations.Migration):

    dependencies = [
        (&#x27;app&#x27;, &#x27;0001_initial&#x27;),
    ]

    operations = [
        migrations.AddField(
            model_name=&#x27;foo&#x27;,
            name=&#x27;bar&#x27;,
            field=models.IntegerField(default=10),
        ),
    ]
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Which results in these SQL statements:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;BEGIN;
--
-- Add field bar to foo
--
ALTER TABLE &amp;quot;myfoo&amp;quot; ADD COLUMN &amp;quot;bar&amp;quot; integer DEFAULT 10 NOT NULL;
ALTER TABLE &amp;quot;myfoo&amp;quot; ALTER COLUMN &amp;quot;bar&amp;quot; DROP DEFAULT;
COMMIT;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Why is Django creating a default and dropping it immediately?
This follows from three considerations:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Django allows &lt;code&gt;default&lt;/code&gt; to be a callable:&lt;pre&gt;&lt;code class=&quot;language-py&quot;&gt;def my_default():
    import random
    return random.randint(0, 42)

class Foo(models.Model):
    bar = models.IntegerField(null=False, default=my_default)
&lt;/code&gt;&lt;/pre&gt;
In this case, Django calls &lt;code&gt;my_default&lt;/code&gt; once and uses the returned value
to generate the DDL statement. If you run &lt;code&gt;sqlmigrate&lt;/code&gt; multiple
times, it will even generate different outputs!&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- ./manage.py sqlmigrate app 0004

BEGIN;
ALTER TABLE &amp;quot;foo&amp;quot; ADD COLUMN &amp;quot;bar&amp;quot; integer DEFAULT 15 NOT NULL;
ALTER TABLE &amp;quot;foo&amp;quot; ALTER COLUMN &amp;quot;bar&amp;quot; DROP DEFAULT;
COMMIT;

-- ./manage.py sqlmigrate app 0004

BEGIN;
--
-- Add field bar to foo
--
ALTER TABLE &amp;quot;foo&amp;quot; ADD COLUMN &amp;quot;bar&amp;quot; integer DEFAULT 4 NOT NULL;
ALTER TABLE &amp;quot;foo&amp;quot; ALTER COLUMN &amp;quot;bar&amp;quot; DROP DEFAULT;
COMMIT;
&lt;/code&gt;&lt;/pre&gt;
&lt;/li&gt;
&lt;li&gt;As a consequence of the above, the callable may contain arbitrarily complex
logic that cannot be reproduced in SQL. This means that Django has to
enforce the default at the application level, not at the database level.&lt;/li&gt;
&lt;li&gt;If the above is true, why have a &lt;code&gt;DEFAULT&lt;/code&gt; then? That&#x27;s because adding a
&lt;code&gt;NOT NULL&lt;/code&gt; column without a default to a table that already has rows is an
error in Postgres:&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;ALTER TABLE foo ADD COLUMN buzz_buzz int NOT NULL;
-- ERROR:  column &amp;quot;buzz_buzz&amp;quot; of relation &amp;quot;foo&amp;quot; contains null values
&lt;/code&gt;&lt;/pre&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;We can see these limitations as a consequence of Django&#x27;s design to allow the
&lt;code&gt;default&lt;/code&gt; argument to work with callables.&lt;/p&gt;
&lt;h2&gt;Further Problems&lt;/h2&gt;
&lt;p&gt;If your database is also written to from &lt;em&gt;outside&lt;/em&gt; the Django application,
the defaults won&#x27;t be applied there. From a data-integrity perspective, it is better
to enforce rules in the database than in the application.&lt;/p&gt;
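&lt;p&gt;As a sketch of what that means in practice, assuming the first migration above
(the one that drops the default) has been applied to &lt;code&gt;myfoo&lt;/code&gt; and that
its &lt;code&gt;id&lt;/code&gt; column is auto-generated, an insert issued from
&lt;code&gt;psql&lt;/code&gt; has to supply &lt;code&gt;bar&lt;/code&gt; itself:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- bar is NOT NULL and has no database default, so this is rejected
-- with a not-null violation.
INSERT INTO &amp;quot;myfoo&amp;quot; DEFAULT VALUES;

-- Callers outside Django must always provide the value explicitly.
INSERT INTO &amp;quot;myfoo&amp;quot; (&amp;quot;bar&amp;quot;) VALUES (10);
&lt;/code&gt;&lt;/pre&gt;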
&lt;h2&gt;A Little Plot Twist&lt;/h2&gt;
&lt;p&gt;Luckily, Django 5.0 introduced the &lt;code&gt;Field.db_default&lt;/code&gt; parameter, which allows
the default to be enforced at the database level!&lt;/p&gt;
&lt;p&gt;So you can have this change:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;class Foo(models.Model):
    bar = models.IntegerField(null=False, db_default=10)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Which creates these changes:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;BEGIN;
--
-- Add field bar to foo
--
ALTER TABLE &amp;quot;myfoo&amp;quot; ADD COLUMN &amp;quot;bar&amp;quot; integer DEFAULT 10 NOT NULL;
COMMIT;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Note how the &lt;code&gt;DEFAULT&lt;/code&gt; is not dropped in this case.&lt;/p&gt;
&lt;script async src=&quot;https://scripts.simpleanalyticscdn.com/latest.js&quot;&gt;&lt;/script&gt;</content><published>2025-02-17T00:00:00Z</published><updated>2025-02-17T00:00:00Z</updated></entry><entry><title>Should you not use Postgres varchar(n) by default?</title><link href="https://www.marcelofern.com/posts/postgres/should_you_not_use_varchar_n/index.html"/><id>tag:marcelofern.com,2025-02-01:/postgres/should_you_not_use_varchar_n/index.html</id><content type="html">&lt;h1&gt;Should you not use Postgres varchar(n) by default?&lt;/h1&gt;
&lt;pre&gt;&lt;code&gt;Created at: 2025-02-01
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The Postgres wiki has a page called &lt;a href=&quot;https://wiki.postgresql.org/wiki/Don%27t_Do_This#Don.27t_use_varchar.28n.29_by_default&quot;&gt;&amp;quot;Don&#x27;t Do
This&amp;quot;&lt;/a&gt;
where general good practices are discussed.&lt;/p&gt;
&lt;p&gt;Amongst them, there is a section titled: &lt;strong&gt;&amp;quot;Don&#x27;t use varchar(n) by default&amp;quot;&lt;/strong&gt;
which is copied verbatim below:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Why not?&lt;/strong&gt;
varchar(n) is a variable width text field that will throw an error if you try
and insert a string longer than n characters (not bytes) into it.&lt;/p&gt;
&lt;p&gt;varchar (without the (n)) or text are similar, but without the length limit.
If you insert the same string into the three field types they will take up
exactly the same amount of space, and you won&#x27;t be able to measure any
difference in performance.&lt;/p&gt;
&lt;p&gt;If what you really need is a text field with an length limit then varchar(n)
is great, but if you pick an arbitrary length and choose varchar(20) for a
surname field you&#x27;re risking production errors in the future when Hubert
Blaine Wolfe­schlegel­stein­hausen­berger­dorff signs up for service.&lt;/p&gt;
&lt;p&gt;Some databases don&#x27;t have a type that can hold arbitrary long text, or if
they do it&#x27;s not as convenient or efficient or well-supported as varchar(n).
Users from those databases will often use something like varchar(255) when
what they really want is text.&lt;/p&gt;
&lt;p&gt;If you need to constrain the value in a field you probably need something
more specific than a maximum length - maybe a minimum length too, or a
limited set of characters - and a check constraint can do all of those things
as well as a maximum string length.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;When should you?&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;When you want to, really. If what you want is a text field that will throw an
error if you insert too long a string into it, and you don&#x27;t want to use an
explicit check constraint then varchar(n) is a perfectly good type. Just
don&#x27;t use it automatically without thinking about it.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;Also, the varchar type is in the SQL standard, unlike the text type, so it
might be the best choice for writing super-portable applications.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The reasons for using &lt;code&gt;varchar&lt;/code&gt; (without the (n)) are compelling:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;No performance penalties.&lt;/li&gt;
&lt;li&gt;Reduced risk of errors if you misjudged the size of the data.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;What the wiki doesn&#x27;t do a good job of is steelmanning the case against the
approach it recommends, namely: &lt;strong&gt;using varchar by default&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;Let&#x27;s go over them.&lt;/p&gt;
&lt;h2&gt;Denial-of-Service (DoS) via Uncontrolled Data Insertion&lt;/h2&gt;
&lt;p&gt;If you choose a bare &lt;code&gt;varchar&lt;/code&gt; for your surname field, you&#x27;ll need validation
somewhere in the application to ensure this field doesn&#x27;t become a vector for
attacks.&lt;/p&gt;
&lt;p&gt;If a malicious party finds this free-text field without an upper-limit
validation, they can perform database stuffing by storing enormous volumes of
data in the database, resulting in a DoS attack.&lt;/p&gt;
&lt;p&gt;Even if the application pre-validates the data before storing it, this level
of validation is a much weaker guarantee of data integrity than delegating
the job to the database. The database is excellent at data integrity
guarantees; application code is not.&lt;/p&gt;
&lt;p&gt;Most services have hard constraints on such inputs. For example, the below is
the limit for names on Twitter:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;img_twitter_limits.png&quot; alt=&quot;img_twitter_limits.png&quot;&gt;&lt;/p&gt;
&lt;h2&gt;Storing free-text in a database may not be a good idea&lt;/h2&gt;
&lt;p&gt;Databases are optimised for structured data. There are better alternatives for
storing free-text like S3, CDNs, or even just a dumb static file server.&lt;/p&gt;
&lt;p&gt;Having large free-text fields stored on a table will reduce the performance of
the server. At a minimum you have the overhead of a TOAST table for some
large rows, but you are also slowing down many db maintenance activities and
backup tasks for data that you might not always need to have at hand.&lt;/p&gt;
&lt;h2&gt;Increasing the size of a varchar(n) is not a problem&lt;/h2&gt;
&lt;p&gt;Performing an ALTER TABLE to bump the value of &lt;code&gt;n&lt;/code&gt; up is a catalogue-only
operation and won&#x27;t culminate in a database outage.&lt;/p&gt;
&lt;p&gt;Of course, at that point you might have had a few angry customers complaining
about errors in the application. You have to weigh that against the risk of
a DoS via Uncontrolled Data Insertion attack instead.&lt;/p&gt;
&lt;p&gt;There is a real problem though if you want to &lt;strong&gt;decrease&lt;/strong&gt; the value of &lt;code&gt;n&lt;/code&gt;.
This will rewrite the table:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- This will tell you if a table is being re-written.
SET client_min_messages=debug1;

DROP TABLE IF EXISTS test;

CREATE TABLE test (id SERIAL PRIMARY KEY, str varchar(6));

INSERT INTO test (str) SELECT generate_series(1, 1000);

-- Increasing the value of `n`, no problem here.
ALTER TABLE test ALTER COLUMN str TYPE varchar(7);

-- Also completely removing `n`, no problem!
-- Caveat: this will trigger a potential &amp;quot;building_index&amp;quot; operation for the
-- TOAST table.
ALTER TABLE test ALTER COLUMN str TYPE varchar;

-- Decreasing the value of `n`. This is a risky operation!
ALTER TABLE test ALTER COLUMN str TYPE varchar(4);
-- DEBUG:  rewriting table &amp;quot;test&amp;quot;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I haven&#x27;t seen cases of having to reduce the value of &lt;code&gt;n&lt;/code&gt; before in production.&lt;/p&gt;
&lt;p&gt;But even then, there is a way to set a lower upper bound without downtime
via check constraints:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- [OPTIONAL] you can promote the field to a bare varchar first
ALTER TABLE test ALTER COLUMN str TYPE varchar;

-- Add a NOT VALID constraint, so that it does not scan the table while holding
-- an AccessExclusive Lock.
ALTER TABLE test
ADD CONSTRAINT chk_str_length CHECK (LENGTH(str) &amp;lt;= 4)
NOT VALID;

-- This will only acquire a ShareUpdateExclusiveLock
ALTER TABLE test VALIDATE CONSTRAINT chk_str_length;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Note that the check constraint performance may be slower than the native
&lt;code&gt;varchar(n)&lt;/code&gt; check due to the function evaluations behind performing a
constraint check.&lt;/p&gt;
&lt;h2&gt;Conclusions&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Think thoroughly about upper/lower bounds of your data before creating a
field.&lt;/li&gt;
&lt;li&gt;Weigh the risk of length-limit errors against a potential
DoS attack surface.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Do not&lt;/strong&gt; reach for free-text fields by default, unless you are always
adding a CHECK constraint to enforce input limits.&lt;/li&gt;
&lt;/ul&gt;
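&lt;p&gt;As an illustration of the last point, here is a sketch of a bare
&lt;code&gt;varchar&lt;/code&gt; guarded by a check constraint. The table name, the
constraint name and the bounds are made up for the example.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;CREATE TABLE customer (
    id SERIAL PRIMARY KEY,
    surname varchar NOT NULL,
    -- Enforce both a lower and an upper bound in the database itself.
    CONSTRAINT chk_surname_length CHECK (char_length(surname) BETWEEN 1 AND 100)
);

-- Rejected: violates check constraint &amp;quot;chk_surname_length&amp;quot;.
INSERT INTO customer (surname) VALUES (repeat(&#x27;x&#x27;, 101));
&lt;/code&gt;&lt;/pre&gt;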
&lt;script async src=&quot;https://scripts.simpleanalyticscdn.com/latest.js&quot;&gt;&lt;/script&gt;</content><published>2025-02-01T00:00:00Z</published><updated>2025-02-01T00:00:00Z</updated></entry><entry><title>Code Reviews In Vim</title><link href="https://www.marcelofern.com/posts/git/code_reviews_in_vim/index.html"/><id>tag:marcelofern.com,2024-11-21:/git/code_reviews_in_vim/index.html</id><content type="html">&lt;h1&gt;Code Reviews In Vim&lt;/h1&gt;
&lt;pre&gt;&lt;code&gt;Created at: 2024-11-21
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;A common way of reviewing code today is by performing the review using the
repository host UI (GitHub, GitLab, etc.).&lt;/p&gt;
&lt;p&gt;I have done that for a while, and I still do it when code changes are trivial.&lt;/p&gt;
&lt;p&gt;&amp;quot;Trivial&amp;quot; means I don&#x27;t need to play with the branch locally first to have
confidence the changes are correct.&lt;/p&gt;
&lt;p&gt;However, I often work with code where just reading the diff isn&#x27;t enough to feel
confident it does the right thing.&lt;/p&gt;
&lt;p&gt;For more complex cases, having the branch locally allows me to inspect and
alter the code better than I can using the repository host UI.&lt;/p&gt;
&lt;h2&gt;Fetching&lt;/h2&gt;
&lt;p&gt;The first step is to get the branch locally:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-sh&quot;&gt;git fetch origin branch_name &amp;amp;&amp;amp; git checkout branch_name
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You can simplify this command and skip to &lt;code&gt;git checkout&lt;/code&gt; if you always fetch
the entire remote (which I don&#x27;t do because it takes too much space/time).&lt;/p&gt;
&lt;h2&gt;Showing the commits&lt;/h2&gt;
&lt;p&gt;The next step is to find exactly which commits the new branch includes.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-sh&quot;&gt;git log -p master..HEAD
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This command basically means &amp;quot;show me all the commits that this branch (HEAD)
introduced since it diverged from master&amp;quot;.&lt;/p&gt;
&lt;p&gt;This effectively shows the commits created by the branch author and nothing
else.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;-p&lt;/code&gt; (patch) flag includes the diff for each commit in the result. I skip
this sometimes if I am visualising the patches in a different way.&lt;/p&gt;
&lt;p&gt;When in Vim, I use the vim-fugitive plugin. Running the same command through
the plugin wrapper gives me a quick-fix window containing the commits from the
pull request.&lt;/p&gt;
&lt;p&gt;Running &lt;code&gt;:Gclog master..HEAD&lt;/code&gt; looks like this:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;gclog.png&quot; alt=&quot;gclog.png&quot;&gt;&lt;/p&gt;
&lt;p&gt;Now I can quickly navigate between commits to see what&#x27;s changed.&lt;/p&gt;
&lt;p&gt;If you are using a different tool but still want to see the commit changes in
your editor, you can try:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-sh&quot;&gt;git show &amp;lt;commit_hash&amp;gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Checking Out Each Commit&lt;/h2&gt;
&lt;p&gt;I currently work on codebases that use atomic commits.
This essentially means that for each commit:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;The test suite must fully pass.&lt;/li&gt;
&lt;li&gt;The integrity of the codebase isn&#x27;t in jeopardy (no half-done changes
between commits).&lt;/li&gt;
&lt;li&gt;The codebase is in a deployable state.&lt;/li&gt;
&lt;li&gt;A commit explains a single change, not multiple.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;For example, a commit title &amp;quot;Change X &lt;strong&gt;and&lt;/strong&gt; Y&amp;quot; is an indication that the
commit isn&#x27;t atomic. Multiple things are changing in the same commit.&lt;/p&gt;
&lt;p&gt;Having atomic commits means that I can code-review a Pull Request
commit-by-commit.&lt;/p&gt;
&lt;p&gt;Of course there are many more benefits of this practice. For example, I can use
&lt;code&gt;git blame&lt;/code&gt; effectively. No change will be part of a 40-commit rebased
branch that lacks a detailed explanation in the commit description.&lt;/p&gt;
&lt;p&gt;So after I have run &lt;code&gt;git log -p master..HEAD&lt;/code&gt;, I will go through each commit
and perform:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-sh&quot;&gt;# Checkout the relevant commit I want to play with
git checkout &amp;lt;commit hash&amp;gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If I have messed things up, I can just check the reflog and go back to the
place where I got the branch from.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-sh&quot;&gt;# List the &amp;quot;reference logs&amp;quot; to find the record when the tip of the branch
# reference changed.

git reflog

# Now go back to the hash recorded when I first checked out the branch.
# It will look something like:
# 43c0eb3 HEAD@{3}: checkout: moving from main to my_branch

git checkout 43c0eb3
&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Taking Notes&lt;/h2&gt;
&lt;p&gt;If I am reviewing a complicated branch, I will usually open a new file in the
&lt;code&gt;/tmp/&lt;/code&gt; folder to take some notes in.&lt;/p&gt;
&lt;p&gt;There isn&#x27;t anything fancy about that. It comes from the principle of wanting
to have vim-editing capabilities when writing down a comment on a pull request.&lt;/p&gt;
&lt;p&gt;Often, I will write code blocks in reply to a commit anyway, so writing
comments in vim is easier than in, say, the GitHub UI.&lt;/p&gt;
&lt;p&gt;Note: There are ways to embed vim into a browser nowadays, but it often feels
strange. I prefer to not use an embedded vim.&lt;/p&gt;
&lt;script async src=&quot;https://scripts.simpleanalyticscdn.com/latest.js&quot;&gt;&lt;/script&gt;</content><published>2024-11-21T00:00:00Z</published><updated>2024-11-21T00:00:00Z</updated></entry><entry><title>About That Postgres Json Field</title><link href="https://www.marcelofern.com/posts/postgres/about-that-postgres-json-field/index.html"/><id>tag:marcelofern.com,2024-10-10:/postgres/about-that-postgres-json-field/index.html</id><content type="html">&lt;h1&gt;About That Postgres Json Field&lt;/h1&gt;
&lt;pre&gt;&lt;code&gt;Created at: 2024-10-10
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The json type was introduced in Postgres 9.2. Since then, the json type has
gone through multiple enhancements, including the addition of the jsonb type
(9.4).&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Because the json type stores an exact copy of the input text, it will
preserve semantically-insignificant white space between tokens, as well as
the order of keys within json objects. Also, if a json object within the
value contains the same key more than once, all the key/value pairs are kept.
(The processing functions consider the last value as the operative one.) By
contrast, jsonb does not preserve white space, does not preserve the order of
object keys, and does not keep duplicate object keys. If duplicate keys are
specified in the input, only the last value is kept.
&lt;a href=&quot;http://web.archive.org/web/20241007081857/https://www.postgresql.org/docs/15/datatype-json.html&quot;&gt;source&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;
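&lt;p&gt;A quick way to see the difference described in the quote is to cast the same
literal to both types (a small sketch; the input value is arbitrary):&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- json keeps the input text as-is: white space, key order and duplicates.
SELECT &#x27;{&amp;quot;b&amp;quot;: 1,   &amp;quot;a&amp;quot;: 2, &amp;quot;a&amp;quot;: 3}&#x27;::json;
-- {&amp;quot;b&amp;quot;: 1,   &amp;quot;a&amp;quot;: 2, &amp;quot;a&amp;quot;: 3}

-- jsonb normalises it: keys are reordered and only the last duplicate is kept.
SELECT &#x27;{&amp;quot;b&amp;quot;: 1,   &amp;quot;a&amp;quot;: 2, &amp;quot;a&amp;quot;: 3}&#x27;::jsonb;
-- {&amp;quot;a&amp;quot;: 3, &amp;quot;b&amp;quot;: 1}
&lt;/code&gt;&lt;/pre&gt;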
&lt;p&gt;This sounds great, but there are downsides to storing json fields
(especially big ones) in Postgres.&lt;/p&gt;
&lt;p&gt;The performance of some database operations degrades linearly with
the json size.&lt;/p&gt;
&lt;p&gt;To analyse this behaviour, I set up a sandbox script. The script creates &lt;strong&gt;a
table with 500,000 rows&lt;/strong&gt;, &lt;strong&gt;with varying sizes of json fields&lt;/strong&gt; and runs
common operations against it.&lt;/p&gt;
&lt;p&gt;The script and the data are included verbatim at the bottom of this post.&lt;/p&gt;
&lt;h2&gt;VACUUM&lt;/h2&gt;
&lt;p&gt;&lt;img src=&quot;vacuum.png&quot; alt=&quot;vacuum.png&quot;&gt;&lt;/p&gt;
&lt;p&gt;VACUUM becomes more resource-intensive for tables with large json values. Each
dead tuple linked to a large json field adds to the overhead.&lt;/p&gt;
&lt;p&gt;This makes VACUUM run longer, possibly delaying other maintenance tasks or
DDLs.&lt;/p&gt;
&lt;h2&gt;INSERTS&lt;/h2&gt;
&lt;p&gt;&lt;img src=&quot;inserts.png&quot; alt=&quot;inserts.png&quot;&gt;&lt;/p&gt;
&lt;h2&gt;UPDATES&lt;/h2&gt;
&lt;p&gt;&lt;img src=&quot;updates.png&quot; alt=&quot;updates.png&quot;&gt;&lt;/p&gt;
&lt;h2&gt;SELECT (queries)&lt;/h2&gt;
&lt;p&gt;&lt;img src=&quot;queries.png&quot; alt=&quot;queries.png&quot;&gt;&lt;/p&gt;
&lt;p&gt;Big json fields can degrade query performance. Tables with such fields
take longer to read from disk and use more storage when cached in memory.&lt;/p&gt;
&lt;p&gt;This makes querying these tables less efficient. SELECTs have to do more work
to scan the same amount of data.&lt;/p&gt;
&lt;p&gt;If the json field doesn’t exceed a particular threshold (2KB by default, but
configurable per table with &lt;code&gt;CREATE TABLE ... WITH (toast_tuple_target=128)&lt;/code&gt;), it won’t be stored in a toast table. Instead, it
will be stored inline on the page.&lt;/p&gt;
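&lt;p&gt;For an existing table the same storage parameter can be changed with
&lt;code&gt;ALTER TABLE&lt;/code&gt;. A small sketch, assuming a table called
&lt;code&gt;my_table&lt;/code&gt; with a json column named &lt;code&gt;data&lt;/code&gt; (both names are
placeholders):&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- Only affects rows written after the change; existing rows stay as they are.
ALTER TABLE my_table SET (toast_tuple_target = 256);

-- Stored size in bytes of the json value for a few rows.
SELECT pg_column_size(data) AS stored_bytes
FROM my_table
LIMIT 10;
&lt;/code&gt;&lt;/pre&gt;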
&lt;p&gt;If a table doesn&#x27;t have a high ratio of HOT updates, the volume of dead
rows will increase. This further affects performance.&lt;/p&gt;
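&lt;p&gt;One way to check this ratio is through the cumulative statistics views. The
sketch below assumes the table is called &lt;code&gt;my_table&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;SELECT
    relname,
    n_tup_upd,
    n_tup_hot_upd,
    -- NULLIF avoids a division by zero when the table has no updates yet.
    round(100.0 * n_tup_hot_upd / NULLIF(n_tup_upd, 0), 1) AS hot_update_pct,
    n_dead_tup
FROM pg_stat_user_tables
WHERE relname = &#x27;my_table&#x27;;
&lt;/code&gt;&lt;/pre&gt;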
&lt;h2&gt;Considerations&lt;/h2&gt;
&lt;p&gt;Consider omitting the json field from your SELECT queries when applicable. This
will remove the overhead added by querying and decoding the data from the TOAST
table.&lt;/p&gt;
&lt;p&gt;If you don&#x27;t update the json data after it has been stored, consider a
different type of storage. One option is storing the data on a bucket like S3.&lt;/p&gt;
&lt;p&gt;Buckets can be a good option if you are satisfied with the following
trade-offs:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;| Factor                   | Keep json in PostgreSQL                  | Move json to S3                              |
|--------------------------|------------------------------------------|----------------------------------------------|
| Database size            | Increases database size                  | Keeps DB lean; reduces size                  |
| Performance              | json querying is slower for large fields | Keeps queries fast; json retrieved separately|
| Cost                     | Higher storage costs                     | Cheaper for large, unstructured data         |
| Atomicity &amp;amp; Transactions | Full transactional consistency           | No transactional guarantees                  |
| Querying                 | Direct SQL querying on json              | No direct querying                           |
| Simplicity               | All data in one place                    | Separate management of S3 and DB             |
| Access Latency           | Low-latency access                       | Potential latency in fetching from S3        |
&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Inspecting the TOAST table&lt;/h2&gt;
&lt;p&gt;You can find the name of a toast table with this query:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;SELECT
    c.relname AS main_table,
    t.relname AS toast_table
FROM
    pg_class c
JOIN
    pg_class t ON c.reltoastrelid = t.oid
WHERE
    c.relname = &#x27;my_table&#x27;;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;After you find the name of that toast table, you can look up its statistics in
&lt;code&gt;pg_stat_all_tables&lt;/code&gt; (assuming the toast table name is
pg_toast_24683):&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;SELECT *
FROM pg_stat_all_tables
WHERE relid = &#x27;pg_toast.pg_toast_24683&#x27;::regclass;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You can also see how many chunks each toasted value (&lt;code&gt;chunk_id&lt;/code&gt;) is split into, and their total size:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;SELECT
    chunk_id,
    COUNT(*) as chunks,
    pg_size_pretty(sum(octet_length(chunk_data)::bigint))
FROM pg_toast.pg_toast_24683
GROUP BY 1 ORDER BY 1;
&lt;/code&gt;&lt;/pre&gt;
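&lt;p&gt;As a rough cross-check for the &lt;code&gt;chunks&lt;/code&gt; column, here is a minimal Python sketch
that estimates how many chunks a value of a given stored size is split into. It
assumes a default build with 8kB pages, where a chunk holds at most 1996 bytes
(the Postgres docs describe this as about 2000 bytes). The helper
&lt;code&gt;expected_chunks&lt;/code&gt; is a hypothetical name used for illustration, and its input
is the size after any compression, not the raw json length.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;import math

# Assumption: default build with 8kB pages, where a single TOAST
# chunk holds at most 1996 bytes. Adjust for non-default builds.
TOAST_MAX_CHUNK_SIZE = 1996


def expected_chunks(stored_size_in_bytes):
    &amp;quot;&amp;quot;&amp;quot;Estimate the number of chunk rows for one toasted value.&amp;quot;&amp;quot;&amp;quot;
    return max(0, math.ceil(stored_size_in_bytes / TOAST_MAX_CHUNK_SIZE))


for size in (1_996, 5_000, 40_000):
    print(f&amp;quot;{size:,} bytes: {expected_chunks(size)} chunk(s)&amp;quot;)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If the counts returned by the query are much lower than this estimate for the
raw json size, the values are probably being compressed before they are chunked.&lt;/p&gt;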
&lt;p&gt;You can also compare the size of the toast table with the size of the table itself:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;SELECT
    c1.relname,
    pg_size_pretty(pg_relation_size(c1.relname::regclass)) AS size,
    c2.relname AS toast_relname,
    pg_size_pretty(pg_relation_size((&#x27;pg_toast.&#x27; || c2.relname)::regclass)) AS toast_size
FROM
    pg_class c1
    JOIN pg_class c2 ON c1.reltoastrelid = c2.oid
WHERE
    c1.relkind = &#x27;r&#x27;
    AND c1.relname = &#x27;table_name&#x27;;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And you can query its rows as a regular table:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;SELECT *
FROM pg_toast.pg_toast_24683 LIMIT 100;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;For more on TOAST, follow this link:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://hakibenita.com/sql-medium-text-performance&quot;&gt;blog post hakibenita&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;The benchmark script&lt;/h2&gt;
&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;import psycopg2
import time
import os
import json
import random
import string

&amp;quot;&amp;quot;&amp;quot;
This code creates a benchmark for Postgres tables with
json fields.

For a table with NUM_OF_ROWS, for each value of JSON_SIZE_IN_BYTES:

- Check how long it takes to update PERCENTAGE rows in the table.
- Check how long it takes to insert PERCENTAGE rows in the table.
- Without vacuuming yet, check the average time to query NUM_QUERIES. There
  will be dead tuples impacting the performance from the operations above.
- Check how long it takes to vacuum.

&amp;quot;&amp;quot;&amp;quot;

NUM_OF_ROWS = 500_000

JSON_SIZE_IN_BYTES = [
    10,
    100,
    200,
    500,
    1000,
    3_000,
    5_000,
    10_000,
    15_000,
    20_000,
    40_000,
]
PERCENTAGE = 0.1

NUM_QUERIES = 10_000
QUERY_LIMIT = 500


def get_cursor_and_connection():
    # Update connection details as per your PostgreSQL setup
    conn = psycopg2.connect(
        dbname=&amp;quot;test_db&amp;quot;,
        user=&amp;quot;postgres&amp;quot;,
        password=&amp;quot;postgres&amp;quot;,
        host=&amp;quot;localhost&amp;quot;,
        port=&amp;quot;5441&amp;quot;,
    )
    conn.autocommit = True
    return conn.cursor(), conn


def vacuum_table(cursor):
    print(&amp;quot;- Vacuuming json_bench...&amp;quot;)
    start_time = time.time()
    cursor.execute(&amp;quot;VACUUM ANALYZE json_bench;&amp;quot;)
    duration = time.time() - start_time
    print(f&amp;quot;- Vacuum took: {duration:.2f} seconds.&amp;quot;)
    return duration


def create_table(cursor):
    print(&amp;quot;- Creating table...&amp;quot;)
    cursor.execute(&amp;quot;&amp;quot;&amp;quot;
        -- Idempotency for convenience.
        DROP TABLE IF EXISTS json_bench;
        CREATE TABLE json_bench (
            id SERIAL PRIMARY KEY,
            json_field JSONB
        );

        -- Disable autovacuum to not interfere with results.
        ALTER TABLE json_bench
        SET (autovacuum_enabled = false);

        -- Make sure the json field will be toasted at 2kb
        -- and compressed too.
        ALTER TABLE json_bench
        ALTER COLUMN json_field
        SET STORAGE EXTENDED;

        -- The default toast threshold is 2kb (comp time)
        -- #define TOAST_TUPLE_THRESHOLD 2048
    &amp;quot;&amp;quot;&amp;quot;)


def generate_json(json_size_in_bytes):
    return json.dumps(
        {
            &amp;quot;data&amp;quot;: &amp;quot;&amp;quot;.join(
                random.choices(
                    string.ascii_letters + string.digits, k=json_size_in_bytes
                )
            )
        }
    )


def populate_table(cursor, json_size_in_bytes):
    print(f&amp;quot;- Populating table with {NUM_OF_ROWS:,} rows...&amp;quot;)

    json_data = generate_json(json_size_in_bytes)
    json_rows = [(json_data,) for _ in range(NUM_OF_ROWS)]

    start_time = time.time()
    cursor.executemany(&amp;quot;INSERT INTO json_bench (json_field) VALUES (%s);&amp;quot;, json_rows)
    duration = time.time() - start_time

    print(f&amp;quot;- Populating took: {duration:.2f} seconds.&amp;quot;)
    return duration


def update_data(cursor, json_size_in_bytes):
    &amp;quot;&amp;quot;&amp;quot;
    This will generate some dead rows.
    &amp;quot;&amp;quot;&amp;quot;
    update_count = int(PERCENTAGE * NUM_OF_ROWS)
    print(f&amp;quot;- Updating {update_count:,} rows...&amp;quot;)

    json_data = generate_json(json_size_in_bytes)

    update_query = &amp;quot;&amp;quot;&amp;quot;
    UPDATE json_bench
    SET json_field = %s
    WHERE id IN (
        SELECT id
        FROM json_bench
        ORDER BY id DESC
        LIMIT %s
    );
    &amp;quot;&amp;quot;&amp;quot;

    start_time = time.time()
    cursor.execute(update_query, (json_data, update_count))
    duration = time.time() - start_time

    print(f&amp;quot;- Update took: {duration:.2f} seconds.&amp;quot;)
    return duration


def insert_data(cursor, json_size_in_bytes):
    insertion_count = int(PERCENTAGE * NUM_OF_ROWS)
    print(f&amp;quot;- Inserting {insertion_count:,} rows into the table...&amp;quot;)

    json_data = generate_json(json_size_in_bytes)
    json_rows = [(json_data,) for _ in range(insertion_count)]

    start_time = time.time()
    cursor.executemany(&amp;quot;INSERT INTO json_bench (json_field) VALUES (%s);&amp;quot;, json_rows)
    duration = time.time() - start_time

    print(f&amp;quot;- Insertion took: {duration:.2f} seconds.&amp;quot;)
    return duration


def benchmark_queries(cursor):
    print(f&amp;quot;- Benchmarking {NUM_QUERIES:,} queries against the table...&amp;quot;)
    start_time = time.time()

    for _ in range(NUM_QUERIES):
        cursor.execute(
            f&amp;quot;SELECT * FROM json_bench ORDER BY RANDOM() LIMIT {QUERY_LIMIT};&amp;quot;
        )
        cursor.fetchall()

    duration = time.time() - start_time
    print(f&amp;quot;- Average query time: {(duration/NUM_QUERIES):.5f} seconds.&amp;quot;)
    return duration / NUM_QUERIES


def clear_cache():
    print(&amp;quot;- Clearing cache by restarting docker container...&amp;quot;)
    os.system(&amp;quot;docker restart postgres15&amp;quot;)
    print(&amp;quot;- sleeping for 10s&amp;quot;)
    time.sleep(10)


def query_dead_tuples(cursor):
    cursor.execute(
        &amp;quot;SELECT n_dead_tup FROM pg_stat_user_tables WHERE relname = &#x27;json_bench&#x27;;&amp;quot;
    )
    print(f&amp;quot;- There are {cursor.fetchone()[0]} dead tuples...&amp;quot;)


def query_hot_updates(cursor):
    cursor.execute(
        &amp;quot;&amp;quot;&amp;quot;
        SELECT n_tup_hot_upd
        FROM pg_stat_user_tables
        WHERE relname = &#x27;json_bench&#x27;;
        &amp;quot;&amp;quot;&amp;quot;
    )
    print(f&amp;quot;- There were {cursor.fetchone()[0]} hot updates...&amp;quot;)


def run_tests():
    results = {}
    print(
        f&amp;quot;\nReport details:\n&amp;quot;
        f&amp;quot;  - rows in the table: {NUM_OF_ROWS:,}\n&amp;quot;
        f&amp;quot;  - percentage of updates and inserts: {PERCENTAGE*100:.2f}%\n&amp;quot;
        f&amp;quot;  - number of queries to benchmark: {NUM_QUERIES:,} with limit {QUERY_LIMIT:,}&amp;quot;
    )

    for json_size in JSON_SIZE_IN_BYTES:
        print(f&amp;quot;\nRunning tests with json&#x27;s of {json_size:,} bytes...&amp;quot;)

        # Get a cursor to run queries.
        cursor, conn = get_cursor_and_connection()

        # Create the table and insert data.
        create_table(cursor)
        populate_table(cursor, json_size)

        # Vacuum so there are no dead rows.
        vacuum_table(cursor)

        # Update PERCENTAGE of the rows to create some dead rows
        # for a more realistic scenario.
        update_duration = update_data(cursor, json_size)
        query_dead_tuples(cursor)
        query_hot_updates(cursor)

        # Insert PERCENTAGE new rows while we have dead rows
        insert_duration = insert_data(cursor, json_size)
        query_dead_tuples(cursor)
        query_hot_updates(cursor)

        # With the dead rows, check the query performance
        query_avg_duration = benchmark_queries(cursor)

        # Check how long vacuuming the dead rows takes
        vacuum_duration = vacuum_table(cursor)

        # Clear the Postgres cache (container restart) and get a new cursor
        clear_cache()
        cursor, conn = get_cursor_and_connection()

        results[json_size] = {
            &amp;quot;update_duration&amp;quot;: update_duration,
            &amp;quot;insert_duration&amp;quot;: insert_duration,
            &amp;quot;query_avg_duration&amp;quot;: query_avg_duration,
            &amp;quot;vacuum_duration&amp;quot;: vacuum_duration,
        }

    cursor.close()
    conn.close()

    # Print final results
    print(&amp;quot;\nFinal Results:&amp;quot;)
    for json_size, times in results.items():
        print(f&amp;quot;json size: {json_size:,} bytes&amp;quot;)
        print(f&amp;quot; - Update duration: {times[&#x27;update_duration&#x27;]:.2f} seconds&amp;quot;)
        print(f&amp;quot; - Insert duration: {times[&#x27;insert_duration&#x27;]:.2f} seconds&amp;quot;)
        print(f&amp;quot; - Query AVG duration: {times[&#x27;query_avg_duration&#x27;]:.5f} seconds&amp;quot;)
        print(f&amp;quot; - Vacuum duration: {times[&#x27;vacuum_duration&#x27;]:.2f} seconds&amp;quot;)


run_tests()
&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;The results in CSV format&lt;/h2&gt;
&lt;pre&gt;&lt;code&gt;json_length,update_time,insert_time,avg_query_time,vacuum_time
10,0.22,41.2,0.0689,0.08
100,0.22,40.29,0.07287,0.1
200,0.24,40.73,0.07694,0.16
500,0.35,42.35,0.12824,0.2
1000,0.51,46.06,0.13995,0.36
3000,1.65,71.18,0.10875,2.4
5000,2.17,59.36,0.10136,3.57
10000,9.67,62.27,0.15999,10.67
15000,8.77,91.57,0.16642,16.69
20000,12.7,111.92,0.22364,20.71
40000,28.16,184.93,0.32687,35.98
&lt;/code&gt;&lt;/pre&gt;
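&lt;p&gt;To make the rows easier to compare, here is a small sketch that expresses a
column as a multiple of its value for the smallest json size. It only uses the
numbers from the CSV above; &lt;code&gt;slowdown_vs_smallest&lt;/code&gt; is a hypothetical helper
name, and rounding to one decimal place is an arbitrary choice.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;import csv
import io

# The CSV results from this post, copied as-is.
RESULTS_CSV = &amp;quot;&amp;quot;&amp;quot;\
json_length,update_time,insert_time,avg_query_time,vacuum_time
10,0.22,41.2,0.0689,0.08
100,0.22,40.29,0.07287,0.1
200,0.24,40.73,0.07694,0.16
500,0.35,42.35,0.12824,0.2
1000,0.51,46.06,0.13995,0.36
3000,1.65,71.18,0.10875,2.4
5000,2.17,59.36,0.10136,3.57
10000,9.67,62.27,0.15999,10.67
15000,8.77,91.57,0.16642,16.69
20000,12.7,111.92,0.22364,20.71
40000,28.16,184.93,0.32687,35.98
&amp;quot;&amp;quot;&amp;quot;


def slowdown_vs_smallest(csv_text, column):
    &amp;quot;&amp;quot;&amp;quot;Map each json_length to column value / value of the first row.&amp;quot;&amp;quot;&amp;quot;
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    baseline = float(rows[0][column])
    return {
        int(row[&amp;quot;json_length&amp;quot;]): round(float(row[column]) / baseline, 1)
        for row in rows
    }


for column in (&amp;quot;update_time&amp;quot;, &amp;quot;vacuum_time&amp;quot;):
    print(column, slowdown_vs_smallest(RESULTS_CSV, column))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The ratios make it easier to compare the rows on either side of the 2kB toast
threshold mentioned in the benchmark script.&lt;/p&gt;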
&lt;h2&gt;The machine that ran the tests&lt;/h2&gt;
&lt;pre&gt;&lt;code&gt;[~] neofetch
                   -`                    x@archlinux
                  .o+`                   -----------
                 `ooo/                   OS: Arch Linux x86_64
                `+oooo:                  Host: 20W0005AAU ThinkPad T14 Gen 2i
               `+oooooo:                 Kernel: 6.6.52-1-lts
               -+oooooo+:                Uptime: 3 days, 8 hours, 27 mins
             `/:-:++oooo+:               Packages: 1273 (pacman)
            `/++++/+++++++:              Shell: bash 5.2.37
           `/++++++++++++++:             Resolution: 1920x1080, 1920x1080
          `/+++ooooooooooooo/`           WM: i3
         ./ooosssso++osssssso+`          Theme: Adwaita [GTK2/3]
        .oossssso-````/ossssss+`         Icons: Adwaita [GTK2/3]
       -osssssso.      :ssssssso.        Terminal: alacritty
      :osssssss/        osssso+++.       Terminal Font: LiterationMono Nerd Font
     /ossssssss/        +ssssooo/-       CPU: 11th Gen Intel i5-1135G7 (8) @ 4.200GHz
   `/ossssso+/:-        -:/+osssso+-     GPU: Intel TigerLake-LP GT2 [Iris Xe Graphics]
  `+sso+:-`                 `.-/+oso:    Memory: 6629MiB / 15717MiB
 `++:.                           `-/+/
&lt;/code&gt;&lt;/pre&gt;
&lt;script async src=&quot;https://scripts.simpleanalyticscdn.com/latest.js&quot;&gt;&lt;/script&gt;</content><published>2024-10-10T00:00:00Z</published><updated>2024-10-10T00:00:00Z</updated></entry><entry><title>Thoughts on 3 years of management</title><link href="https://www.marcelofern.com/posts/management/thoughts-on-management/index.html"/><id>tag:marcelofern.com,2024-10-08:/management/thoughts-on-management/index.html</id><content type="html">&lt;h1&gt;Thoughts on 3 years of management&lt;/h1&gt;
&lt;pre&gt;&lt;code&gt;Created at: 2024-10-08
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I&#x27;ve been managing for the past 3 years at my current job. This is not
extensive experience and I still see myself as a junior manager.&lt;/p&gt;
&lt;p&gt;As I reflect on my journey, I talk through lessons learnt from managing
other developers and recurring patterns I have observed.&lt;/p&gt;
&lt;p&gt;Now, a post about management is not cringe enough unless it includes an &amp;quot;advice
top list&amp;quot;.&lt;/p&gt;
&lt;p&gt;The one I could come up with is made of 5 &lt;strong&gt;musts&lt;/strong&gt;.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;You must facilitate.&lt;/li&gt;
&lt;li&gt;You must take active interest in the development of your reports.&lt;/li&gt;
&lt;li&gt;You must reassure through &lt;em&gt;real&lt;/em&gt; recognition.&lt;/li&gt;
&lt;li&gt;You must listen.&lt;/li&gt;
&lt;li&gt;You must keep an open mind about your own management skills.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Note: Having innate charisma would greatly help. It&#x27;d make it easier to perform
most points above. Sadly the nerdy type often lacks charisma. We need
to work harder on it. More on that later.&lt;/p&gt;
&lt;h2&gt;You Must Facilitate&lt;/h2&gt;
&lt;blockquote&gt;
&lt;p&gt;“The manager’s function is not to make people work, but to make it possible
for people to work.” (Peopleware)&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Being promoted to manager after performing well as a developer is a regular
occurrence in many companies. Another regular occurrence is for a former
developer to not be good at managing their peers.&lt;/p&gt;
&lt;p&gt;Promoting a good developer has many positives. It ensures that the new
manager won&#x27;t be a mere conduit for communication. Having experience in the
field helps facilitate technical discussions.&lt;/p&gt;
&lt;p&gt;However, the Venn diagram intersection between the skills required to be a good
developer and to be a good manager is narrow. Facilitation skills do not
feature in the good-developer skill set as much as they should.&lt;/p&gt;
&lt;p&gt;I did not have guidance and mentorship once I became a manager. On top of that,
I wanted to keep coding at the same pace as before. This situation made it
harder for me to manage my own conflicts of interest between coding and
managing.&lt;/p&gt;
&lt;p&gt;That meant that I wasn&#x27;t facilitating much. The lack of leadership and
direction quickly hampered the development of one of my teams.&lt;/p&gt;
&lt;p&gt;Once that team grew to about 7 developers the situation became unsustainable.&lt;/p&gt;
&lt;p&gt;I didn&#x27;t have quality time to spend on important coding tasks because I was
constantly on management duties. At the same time I wasn&#x27;t helping to unblock
my team&#x27;s work as well as I could since I spent considerable time coding.&lt;/p&gt;
&lt;p&gt;By sheer luck, the team was independent enough to perform well without strong
management guidance. That could equally have been the other way around.&lt;/p&gt;
&lt;p&gt;I had to step back and rethink my approach so that I could balance facilitation
duties against coding responsibilities.&lt;/p&gt;
&lt;p&gt;Being &lt;strong&gt;proactive&lt;/strong&gt; about facilitation helped save time for myself and
relieve stress for my team. It sounds cliché, but finding problems before they
happen and answering questions before they are asked is important for
reducing stress across your team.&lt;/p&gt;
&lt;p&gt;The most important thing was realising that unblocking 7 people and letting
them do good work was more important than me, as a single contributor, writing
some of the code.&lt;/p&gt;
&lt;p&gt;This didn&#x27;t make my work easier, though. There are many ways to facilitate
progress, and some situations are harder than others.&lt;/p&gt;
&lt;p&gt;Technical facilitation is usually straightforward. For example, the product
manager asks for a feature but the details aren&#x27;t clear enough for a developer
to jump on the task. You set up a meeting with the client and the P.M. Together
you clear up the requirements so that the developer is not stuck with vague
requirements and a lack of direction.&lt;/p&gt;
&lt;p&gt;The goal isn&#x27;t to detail the design for the solution (the developer will do
that). Instead, the goal is to make sure requirements are understood and there
is a definition of what &amp;quot;done&amp;quot; means for that task.&lt;/p&gt;
&lt;p&gt;Technical facilitation only takes time and organising. I.e., getting the
relevant people together and taking the time to write up the details of what
was discussed and agreed.&lt;/p&gt;
&lt;p&gt;My experience as a developer helped greatly in these areas. I was already used
to going to client meetings and explaining what was possible versus what was
not possible. I also frequently helped clients with estimations backed up
by my knowledge of the tech and current architecture of the project.&lt;/p&gt;
&lt;p&gt;The hard type of facilitation is the human one. For example, when a team member
is having a hard time because of external factors or conflicts between
colleagues. This is the real human-factor part of the job. I have little
experience with this.&lt;/p&gt;
&lt;p&gt;These situations take a lot of time and energy to solve. Each situation of this
type is different and facilitation might have to be defined on a case-by-case
basis.&lt;/p&gt;
&lt;p&gt;The tools made available by the business are relevant here, and the manager needs
to be aware of them to put them to good use. E.g., unlimited leave,
team rotation, mental-health days, external mediation, etc.&lt;/p&gt;
&lt;p&gt;My general advice is to treat the situation with compassion and empathy.
At the end of the day a manager will be dealing with people-problems
frequently. In the eyes of the person being troubled, their problems will be
more concerning than they may be comfortable telling you. You are at
risk of mischaracterising the problem if you don&#x27;t take it seriously.&lt;/p&gt;
&lt;p&gt;There is little point in trying to play a hard hand as an authoritarian
manager that only dictates solutions without taking the time to care and
understand the situation. &lt;em&gt;&amp;quot;It&#x27;s your job, just do it!&amp;quot;&lt;/em&gt;. This almost
always goes wrong and makes the manager lose respect even from the people not
involved in the situation. Luckily this is something I learned from observation
and not from direct experience.&lt;/p&gt;
&lt;h2&gt;You Must Take An Active Interest In The Development Of Your Reports&lt;/h2&gt;
&lt;p&gt;Of all the &amp;quot;musts&amp;quot;, this is the one that took me the longest to think and write
about. It&#x27;s hard to find a recipe for what &amp;quot;taking an active interest&amp;quot; is, even
though most people have a good intuition about what this phrase means.&lt;/p&gt;
&lt;p&gt;Different people need different things depending on where they are in their
careers, what they want to achieve, and who they are.&lt;/p&gt;
&lt;p&gt;The first step is understanding what each report needs based on their career
stage, goals, and personality. But there is a meta step here too. Both you and
the report need to be willing to learn and improve together for that
relationship to work and those questions to be answered. It&#x27;s not just about
the report&#x27;s growth but also the manager&#x27;s willingness to evolve in response to
their team&#x27;s needs.&lt;/p&gt;
&lt;p&gt;In the best manager-managed relationships I have had, &lt;strong&gt;both&lt;/strong&gt; people were
learning together what they needed from each other and what they could provide.&lt;/p&gt;
&lt;p&gt;To use an analogy from pedagogy theory:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Through dialogue, the teacher-of-the-students and the students-of-the-teacher
cease to exist and a new term emerges: teacher-student with
students-teachers. The teacher is no longer merely the-one-who-teaches, but
one who is himself taught in dialogue with the students, who in turn while
being taught also teach. They become jointly responsible for a process in
which all grow − Paulo Freire.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Having a manager who allows themselves to not know, and is comfortable asking
dumb questions, is a &lt;strong&gt;great thing&lt;/strong&gt;. I have had managers on the other side of the
spectrum who pretended to be acquainted with things they had no clue about. I
believe they were insecure about showing potential shortcomings to their
reports. This strikes me as a &lt;strong&gt;bad thing&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;It is difficult to provide quality help to develop a report&#x27;s career if the
manager is not an &lt;strong&gt;active&lt;/strong&gt; member of the team in some capacity.&lt;/p&gt;
&lt;p&gt;It is hard to relate to the work of a report if you have no skin in the game.&lt;/p&gt;
&lt;p&gt;This is why I am in favour of managers that get their hands dirty on the
factory line at least some of the time, even if only on smaller tasks.&lt;/p&gt;
&lt;p&gt;This helps build team spirit and camaraderie. Frankly, you can provide
much richer feedback as a manager if you are close to the work yourself.&lt;/p&gt;
&lt;p&gt;That is not to say that there aren&#x27;t exceptions where the manager cannot be
part of the team. For example, a project may be so novel that no one except the
few reports deep in the weeds can actively contribute to the advancement of
that project.&lt;/p&gt;
&lt;p&gt;This problem also seems to occur more often the higher up the management chain
you go. It must be difficult for the manager-of-managers to stay on top of each
manager&#x27;s team&#x27;s work.&lt;/p&gt;
&lt;p&gt;However, this does not mean that the manager shouldn&#x27;t try their best to
understand the project, even if only to support the team when collateral
damage happens.&lt;/p&gt;
&lt;p&gt;It is hard to take an active interest in someone without getting to know them.
Make sure to invest in the relationship early as it takes time to build trust.&lt;/p&gt;
&lt;p&gt;This is not necessarily popular advice. The book Peopleware makes this
observation below.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;managers are usually not part of the teams that they manage. Teams are made
up of peers, equals that function as equals. The manager is most often
outside the team, giving occasional direction from above and clearing away
administrative and procedural obstacles. By definition, the manager is not a
peer and so can’t be part of the peer group. (Peopleware).&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I recommend not taking the hierarchical manager as an
example. It may work in certain business areas, but I haven&#x27;t seen it work
well in software development so far.&lt;/p&gt;
&lt;h2&gt;You Must Listen&lt;/h2&gt;
&lt;p&gt;&amp;quot;Listening is an active skill&amp;quot;. In this case it is not only a matter of
&amp;quot;paying attention&amp;quot;, but also of making a proactive effort to reflect on what has
been reported and to act on it.&lt;/p&gt;
&lt;p&gt;A good way of creating space for listening is through recurring one-on-one
meetings. Those are the minimum to keep a relationship flowing. Otherwise,
without the space to exchange ideas, get feedback, or otherwise just rant,
there&#x27;ll be a barrier between manager and report that might not serve either.&lt;/p&gt;
&lt;p&gt;It is easy to get caught up in the &amp;quot;busyness-of-it&amp;quot; and put off 1:1&#x27;s - both as a
report and as a manager. If this becomes a frequent occurrence, it might point
to an underlying problem.&lt;/p&gt;
&lt;p&gt;Here is a small list of tips for active listening. It serves well just to be
aware of these things before you jump on a 1:1 with a report:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Never interrupt. It frustrates your report and affects the full understanding
of the message. If you have international people in your team, your way of
communication might differ from theirs. Some cultures tend to provide &lt;a href=&quot;https://en.wikipedia.org/wiki/High-context_and_low-context_cultures&quot;&gt;low
context communication whereas others provide high
context&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Defer judgement. Remain open and neutral. If your report is telling you what
problems they are going through, it is unhelpful to start expressing your
personal opinions on the problem without being asked to or without taking
into consideration social cues for when to do it.&lt;/li&gt;
&lt;li&gt;Avoid distractions. Please, just close Slack.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;You Must Reassure Through Real Recognition&lt;/h2&gt;
&lt;blockquote&gt;
&lt;p&gt;“People who feel untrusted have little inclination to bond together into a
cooperative team.”&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;A pat on the back goes a long way, especially when it&#x27;s done publicly. But
reassuring is not just about rewarding; it is also about building trust and cooperation.&lt;/p&gt;
&lt;p&gt;Everyone likes to feel like they&#x27;re winning and getting validation for their
good work. Many software development managers are introverts who struggle with
giving compliments.&lt;/p&gt;
&lt;p&gt;I think that more than a skill that can be learned, you need to be constantly
aware of opportunities to provide recognition (this is harder than it sounds).&lt;/p&gt;
&lt;p&gt;I have worked for companies that had free-pizza events (or insert another snack
here) to celebrate team achievements. There&#x27;s nothing wrong with that.&lt;/p&gt;
&lt;p&gt;Celebration is necessary for a team to function. However, some organisations
tend to over-characterise such acts as proof of their generosity and
reassurance that employees are doing a good job.&lt;/p&gt;
&lt;p&gt;I think that most employees can see through this over-characterisation and that
it leaves them with a bitter taste in their mouths. Without proper reassurance and
&lt;strong&gt;true&lt;/strong&gt; recognition, it doesn&#x27;t matter how many pizza events there are, people
won&#x27;t feel reassured and valued in their job.&lt;/p&gt;
&lt;p&gt;It might even be the opposite: &amp;quot;why is the company going a long way with such
events while people aren&#x27;t getting paid enough?&amp;quot;. See &lt;a href=&quot;https://en.wikipedia.org/wiki/Bread_and_circuses&quot;&gt;panem et
circenses&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;One of the best kinds of recognition is the financial one, especially when that
financial acknowledgement is made proactively from the manager&#x27;s side before
&amp;quot;official&amp;quot; raise ceremonies take place.&lt;/p&gt;
&lt;p&gt;It might be harder to give financial recognition in start-up companies
struggling for cash. There are other means to account for that like granting
share options or title promotions.&lt;/p&gt;
&lt;p&gt;However, navigate the &amp;quot;title promotion&amp;quot; situation with care. Although such
promotions are a good way to recognise the work that someone has done, they can
also be a problem. Eagerly promoting people who aren&#x27;t ready for the role as a
retention strategy can backfire. This is &lt;strong&gt;not&lt;/strong&gt; true recognition.&lt;/p&gt;
&lt;p&gt;It is incredible that in certain places salary raises only happen once a
year. If you missed the date but got a lot of responsibility on your back,
you need to perform the role for a year without financial recognition.&lt;/p&gt;
&lt;p&gt;It is very hard to be a manager in such companies as you don&#x27;t have freedom to
&lt;em&gt;really&lt;/em&gt; manage your team.&lt;/p&gt;
&lt;h2&gt;You Must Keep An Open Mind About Your Management Skills&lt;/h2&gt;
&lt;blockquote&gt;
&lt;p&gt;The same divisive effect occurs in connection with the so-called “leadership
training courses,” which are (although carried out without any such intention
by many of their organizers) in the last analysis alienating. These courses
are based on the naïve assumption that one can promote the community by
training its leaders—as if it were the parts that promote the whole and not
the whole which, in being promoted, promotes the parts.&lt;/p&gt;
&lt;p&gt;Paulo Freire.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Managing is hard, and there is so much material out there that isn&#x27;t necessarily
relevant or particularly useful to &lt;strong&gt;your&lt;/strong&gt; situation.&lt;/p&gt;
&lt;p&gt;From that, it is easy to grow cynical thinking that &amp;quot;no one can teach
management&amp;quot;. But that is also not a helpful way to see things. I bet your
reports would raise an eyebrow if you said that to them.&lt;/p&gt;
&lt;p&gt;Material on how to manage &lt;strong&gt;creative workers&lt;/strong&gt; like software developers seems
particularly scarce. The creative-types also seem to require a different
kind of management than the general &amp;quot;KPI-based&amp;quot; literature teaches.&lt;/p&gt;
&lt;p&gt;To make matters worse, there are usually not many clear metrics you can track on
how much impact a creative has made (usually in terms of revenue) to the
business. Of course you should be able to tell whether someone is performing to
the level of their role, but overall impact is a harder thing to measure.&lt;/p&gt;
&lt;p&gt;For example, you might have productive developers that look good on paper but
aren&#x27;t generating tons of concrete value. They may, at the same time, demand a
lot of resources from across the team for code review.&lt;/p&gt;
&lt;p&gt;In the same way, you might have workers that seem slower but always provide
high-quality and impactful changes (and feedback) that are aligned with
core-business values.&lt;/p&gt;
&lt;p&gt;These situations are tricky. Part of growing as a manager is recognising those
types of workers exist and providing feedback that enables them both to grow.&lt;/p&gt;
&lt;p&gt;Even though improving-as-a-manager is a slow process and difficult in a
different way than improving-as-a-developer, progressive improvement is
possible.&lt;/p&gt;
&lt;p&gt;Managing takes a different source of energy than coding. If you are not a real
people person you have to be prepared to spend more time and energy being a
manager.&lt;/p&gt;
&lt;p&gt;Quite frankly, it&#x27;s already very hard to keep on top of these &amp;quot;5 musts&amp;quot;,
especially as the number of reports goes up.&lt;/p&gt;
&lt;p&gt;The final advice here is to keep an open mind. Check the literature and try and
read some books on management. Observe how your team members interact with
each other and listen to what they have to say. Learn from teams that perform
well, and from teams that don&#x27;t perform well. What are the differences? Read
other posts like this one about what other managers are thinking and think
critically about what you read. Don&#x27;t take our word for it!&lt;/p&gt;
&lt;p&gt;Some recommendations for reading:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The Culture Map (Erin Meyer)&lt;/li&gt;
&lt;li&gt;Peopleware (Tom DeMarco)&lt;/li&gt;
&lt;li&gt;Ruined by Design (Mike Monteiro)&lt;/li&gt;
&lt;li&gt;The Manager&#x27;s Path (Camille Fournier)&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Closing Remarks&lt;/h2&gt;
&lt;p&gt;Even after writing about all of these topics, I myself am not able to follow
this advice to the letter every single day.&lt;/p&gt;
&lt;p&gt;If not by an act of human fallibility, in some situations it is just not
possible to follow general advice. I think that this is OK.&lt;/p&gt;
&lt;p&gt;What matters the most is:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Constantly learning and taking an active interest in how to be better at
management.&lt;/li&gt;
&lt;li&gt;Checking whether you&#x27;d like someone to manage you the way you manage your
team.&lt;/li&gt;
&lt;li&gt;Critically thinking about management and leadership decisions.&lt;/li&gt;
&lt;li&gt;Keeping track of progress!&lt;/li&gt;
&lt;/ul&gt;
&lt;script async src=&quot;https://scripts.simpleanalyticscdn.com/latest.js&quot;&gt;&lt;/script&gt;</content><published>2024-10-08T00:00:00Z</published><updated>2024-10-08T00:00:00Z</updated></entry><entry><title>On Git Commit Messages</title><link href="https://www.marcelofern.com/posts/git/on-commit-messages/index.html"/><id>tag:marcelofern.com,2024-10-05:/git/on-commit-messages/index.html</id><content type="html">&lt;h1&gt;On Git Commit Messages&lt;/h1&gt;
&lt;pre&gt;&lt;code&gt;Created at: 2024-10-05
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;There are two web pages that provide a great summary on Git Commits best
practice.&lt;/p&gt;
&lt;p&gt;I recommend reading them before going through the rest of this post:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;http://web.archive.org/web/20241003132241/https://wiki.openstack.org/wiki/GitCommitMessages&quot;&gt;Git Commit Good Practice - OpenStack&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;http://web.archive.org/web/20240930112522/https://tbaggery.com/2008/04/19/a-note-about-git-commit-messages.html&quot;&gt;Tim Pope&#x27;s note about commit messages&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Those articles are &amp;quot;old&amp;quot;. One is from 2008 and the other is from 2014. This
means that comments like the following pop up now and then:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;It&#x27;s 2024, do we &lt;strong&gt;really&lt;/strong&gt; have to restrict ourselves to 72-long commit
titles? I think it&#x27;s acceptable to simply:&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;pre&gt;&lt;code&gt;git -m &amp;quot;ISSUE:1234X Add date of birth, salary, ethnicity, pronouns, height (in centimetres), salary text fields, and more, to the request loan submit form for Chameleon MVP.&amp;quot;
&lt;/code&gt;&lt;/pre&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;We also have so many good tools around git that make it easier to see the
changes and the diffs. We can also link to rich context on JIRA and Asana,
why are we focusing so much on terminal limitations?&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;We are agile and always fix forward. We never have use for the old bits of
git like git bisect, git revert, or even git log... What do these do
again?!...&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Given that I have little to contribute to the excellent content in the
articles above, I&#x27;ll limit my contribution to talking about the reason these
posts have aged so well.&lt;/p&gt;
&lt;h2&gt;A Disclaimer&lt;/h2&gt;
&lt;p&gt;Learning a tool like git takes time. To make the most of git, one needs to
learn it well. The same way there are programmers who debug exclusively with
&lt;code&gt;print()&lt;/code&gt;, there are programmers who only use three git commands: &lt;code&gt;pull&lt;/code&gt;,
&lt;code&gt;commit&lt;/code&gt;, and &lt;code&gt;push&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;That is fine. You can go a long way without ever needing more advanced git
commands (or debugging tools). That also means, however, that the
justifications behind good-practice advice will be harder to understand. I will
try to make those clearer in this post even if you don&#x27;t go beyond these three
git commands.&lt;/p&gt;
&lt;p&gt;The tragedy of it all, however, is that not knowing those advanced use cases
before creating a repository might jeopardise the ability of advanced users to
take advantage of good commit etiquette.&lt;/p&gt;
&lt;p&gt;Some code bases have a &amp;quot;before good commits&amp;quot; and &amp;quot;after good commits&amp;quot;. The
&amp;quot;before&amp;quot; is usually a dark place we don&#x27;t like to go.&lt;/p&gt;
&lt;p&gt;Make up your mind wisely, and trust the advice of people who have been there
before and learned the hard lessons.&lt;/p&gt;
&lt;h2&gt;The Summary&lt;/h2&gt;
&lt;p&gt;Before going into the reasons why the advice from those posts is still sound,
here is a very short summary of the two articles above:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Do not mix two unrelated functional changes in the same commit&lt;/strong&gt;: It&#x27;s hard
to catch flaws during review when changes are mixed together. If the commit
needs to be reverted, the two changes need to be untangled first. Similarly,
it is harder to bisect and find which change created a bug if multiple
functional changes are included in a commit.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Do not assume the reviewer uses the same tools as you&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Do not assume the reviewer has access to an external website&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;These provide justification for seemingly arbitrary content on the linked blog
posts such as &lt;em&gt;&amp;quot;commit titles should be no longer than 50 characters and commit
bodies no longer than 72 characters&amp;quot;.&lt;/em&gt;&lt;/p&gt;
&lt;h2&gt;Do not mix two unrelated functional changes in the same commit&lt;/h2&gt;
&lt;blockquote&gt;
&lt;p&gt;So commit messages to me are almost as important as the code change itself.
Sometimes the code change is so obvious that no message is really required,
but that is very very rare. And so one of the things I hope developers are
thinking about, the people who are actually writing code, is not just the
code itself, but explaining why the code does something, and why some change
was needed. Because that then in turn helps the managerial side of the
equation, where if you can explain your code to me, I will trust the code...&lt;/p&gt;
&lt;p&gt;Linus Torvalds.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Let&#x27;s start with a &lt;strong&gt;bad&lt;/strong&gt; example:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;commit e5b18b256c0f4f5d369c62785248632075790867 (HEAD -&amp;gt; master)
Author: John Doe &amp;lt;john.doe@gmail.com&amp;gt;
Date:   Sat Oct 5 18:33:21 2024 +1300

    Revamp customer profile page

    This commit:

      - Refactor ResetPassword form UI to reuse textbox component.
      - Add an index to the &amp;quot;users&amp;quot; table to lookup emails faster.
      - Apply compression to user&#x27;s uploaded profile pictures.
      - Change hash algorithm for profile picture names.
      - Fix broken layout on mobile devices using landscape format.
      - Add a new canary flag to control &amp;quot;Under Maintenance&amp;quot; banner.
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Although John Doe&#x27;s commit title evokes the idea that there&#x27;s only one thing
happening, a closer look at the commit description reveals that there are
many unrelated changes sneaking in at the same time.&lt;/p&gt;
&lt;p&gt;Why is this bad? Let&#x27;s start with a simple example.&lt;/p&gt;
&lt;p&gt;Suppose the third change has a bug:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;em&gt;&amp;quot;Apply compression to user&#x27;s uploaded profile pictures&amp;quot;&lt;/em&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The &lt;code&gt;profile_update.c&lt;/code&gt; file where all the operations for updating a user&#x27;s
profile live has a code-path that crashes the server.&lt;/p&gt;
&lt;p&gt;Naturally you want to revert that commit. But in the meantime John&#x27;s colleague
Mary has changed one of the UI layout files that John&#x27;s commit had also touched
as part of an unrelated change:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;commit 85cf1a1501a2062dbc9310d6b598dcf72e284cbc (HEAD -&amp;gt; master)
Author: Mary Silva &amp;lt;mary.silva@gmail.com&amp;gt;
Date:   Sun Oct 6 20:45:40 2024 +1300

    Upgrade profile page UI layout

    This commit:

      - Move css classes to the new file &amp;quot;user_layout.css&amp;quot;.
      - Refactor text boxes to use the same css style.
      - Remove unreachable (dead) JavaScript code.
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now you can&#x27;t revert John&#x27;s commit because you get a conflict.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;git revert e5b18b256c0f4f5d369c62785248632075790867

CONFLICT (modify/delete): README.md deleted in (empty tree) and modified in
HEAD.  Version HEAD of README.md left in tree.

error: could not revert e5b18b2... Revamp customer profile page
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The bug in the &lt;code&gt;profile_update.c&lt;/code&gt; file has nothing to do with the UI layout in
the profile page.&lt;/p&gt;
&lt;p&gt;When John added all of those unrelated changes in a single commit, his
commit became a &lt;strong&gt;conflict magnet&lt;/strong&gt;. Conflict magnet commits are very hard to
revert.&lt;/p&gt;
&lt;h3&gt;Bisecting&lt;/h3&gt;
&lt;p&gt;In this case we already knew that John&#x27;s commit introduced a bug, but what if
we didn&#x27;t? &lt;code&gt;git bisect&lt;/code&gt; is a git tool built for finding &lt;strong&gt;where&lt;/strong&gt; a bug was
introduced.&lt;/p&gt;
&lt;p&gt;To use &lt;code&gt;git bisect&lt;/code&gt; you give it two arguments: A &amp;quot;bad&amp;quot; commit that is known to
&lt;strong&gt;contain&lt;/strong&gt; the bug (even if not introduced by that commit itself), and a
&amp;quot;good&amp;quot; commit that is known to be before the bug was introduced.&lt;/p&gt;
&lt;p&gt;The short version of what bisect does is: Bisect will pick a commit between
the &amp;quot;bad&amp;quot; and &amp;quot;good&amp;quot; one and ask you whether it&#x27;s good or bad. It is up to you
to decide.&lt;/p&gt;
&lt;p&gt;How you do that depends on the project: you might run the test suite with a
test that reproduces the bug, or simply look at the diff. In each
iteration, &lt;code&gt;git bisect&lt;/code&gt; shrinks the search window until John&#x27;s offending commit
is found.&lt;/p&gt;
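&lt;p&gt;To make the shrinking search window concrete, here is a minimal Python sketch
of the idea behind bisecting. It is not how git implements it: the
&lt;code&gt;first_bad_commit&lt;/code&gt; function and the commit names are made up for
illustration, and the sketch assumes a linear history where the first commit is
good and the last one is bad.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;def first_bad_commit(commits, is_bad):
    &quot;&quot;&quot;Return (first bad commit, number of checks).

    Hypothetical helper: `commits` is ordered oldest to newest, the
    first entry is known to be good and the last one known to be bad.
    &quot;&quot;&quot;
    if len(commits) &lt; 2:
        raise ValueError(&quot;need at least one good and one bad commit&quot;)
    good, bad = 0, len(commits) - 1
    checks = 0
    # Invariant: commits[good] is good and commits[bad] is bad.
    while bad - good &gt; 1:
        middle = (good + bad) // 2
        checks += 1
        if is_bad(commits[middle]):
            bad = middle  # the bug was introduced at or before middle
        else:
            good = middle  # the bug was introduced after middle
    return commits[bad], checks


# Made-up history: c1 is the oldest commit and the bug lands in c11.
commits = [f&quot;c{n}&quot; for n in range(1, 17)]
broken = set(commits[10:])
print(first_bad_commit(commits, lambda commit: commit in broken))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;On this made-up history of sixteen commits the search lands on &lt;code&gt;c11&lt;/code&gt;
after four checks. The &lt;code&gt;is_bad&lt;/code&gt; callback plays the role of you (or
your test suite) answering &amp;quot;good&amp;quot; or &amp;quot;bad&amp;quot; at each step.&lt;/p&gt;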
&lt;p&gt;Inevitably you find John&#x27;s commit. However, the commit contains two changes to
the same &lt;code&gt;profile_update.c&lt;/code&gt; file, and either of them could be causing the bug.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;  - Apply compression to user&#x27;s uploaded profile pictures.
  - Change hash algorithm for profile picture names.
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;So which of the changes is the bad one?&lt;/p&gt;
&lt;p&gt;That question might not be trivial to answer. It might be hard to untangle
which change actually broke the code, especially if the compression algorithm
and the hash algorithm use the same underlying routines.&lt;/p&gt;
&lt;h3&gt;Function evolutions&lt;/h3&gt;
&lt;p&gt;Another tool that can&#x27;t be used as well if multiple changes are present in the
same commit is &lt;code&gt;git log -L&lt;/code&gt;.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-sh&quot;&gt;git log -L :&amp;lt;funcname&amp;gt;:&amp;lt;file_name&amp;gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;By running the command above, git will display a diff with all the commits
that touched that function in the past.&lt;/p&gt;
&lt;p&gt;This presents a way to see the &amp;quot;evolution&amp;quot; of a function over time.&lt;/p&gt;
&lt;p&gt;You want commits to be split so that you can see which individual patches
changed that function.&lt;/p&gt;
&lt;p&gt;Having a single commit with too much noise makes it more difficult to
understand &lt;strong&gt;why&lt;/strong&gt; that function changed.&lt;/p&gt;
&lt;p&gt;If you are not convinced yet, there are many more git tools that are affected
by non-atomic commits. Check the list below and see if you use any of the
following:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;git blame&lt;/code&gt;: Atomic commits give you the direct answer to: &amp;quot;Why was this
change made?&amp;quot;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;git rebase&lt;/code&gt;: For rebasing, dropping changes, re-editing commit messages, or
adding fix-ups.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;git cherry-pick&lt;/code&gt;: For applying a specific commit from one branch to another.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;git diff&lt;/code&gt;: For seeing one change at a time.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;It&#x27;s reasonable to conclude that one commit per functional change is still
relevant today.&lt;/p&gt;
&lt;h2&gt;Do not assume the reviewer uses the same tools as you&lt;/h2&gt;
&lt;blockquote&gt;
&lt;p&gt;Word-wrapping is a property of the text. And the tool you use to
visualize things cannot know. End result: you do word-wrapping at the
only stage where you can do it, namely when writing it. Not when
showing it.&lt;/p&gt;
&lt;p&gt;Some things should not be word-wrapped. They may be some kind of
quoted text - long compiler error messages, oops reports, whatever.
Things that have a certain specific format.&lt;/p&gt;
&lt;p&gt;The tool displaying the thing can&#x27;t know. The person writing the
commit message can. End result: you&#x27;d better do word-wrapping at
commit time, because that&#x27;s the only time you know the difference.&lt;/p&gt;
&lt;p&gt;(And the rule is not 80 characters, because you do want to allow the
standard indentation from git log, and you do want to leave some room
for quoting).&lt;/p&gt;
&lt;p&gt;Linus Torvalds&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This is what a long commit looks like with the default pager (&lt;code&gt;less&lt;/code&gt; on most
*nix systems).&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;long_commit.png&quot; alt=&quot;long_commit.png&quot;&gt;&lt;/p&gt;
&lt;p&gt;Those white arrows at the right-hand side show where the text was truncated.&lt;/p&gt;
&lt;p&gt;Although &lt;code&gt;less&lt;/code&gt; supports wrapping text, it may not be on by default depending
on how your OS came configured.&lt;/p&gt;
&lt;p&gt;With a long summary, the output of every command that shows the commit summary
(top line) gets truncated and becomes unreadable. More examples of such commands
are found on Tim Pope&#x27;s &lt;a href=&quot;http://web.archive.org/web/20240930112522/https://tbaggery.com/2008/04/19/a-note-about-git-commit-messages.html&quot;&gt;blog
post&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;There isn&#x27;t much I can add here. I think that it is important to have good
writing skills for both commit titles and commit messages with the goal of
keeping them succinct and informative at the same time. You can see for
yourself how nice &lt;a href=&quot;https://github.com/torvalds/linux/commits/master/&quot;&gt;Linux&#x27;s kernel git
log&lt;/a&gt; reads for inspiration.&lt;/p&gt;
&lt;p&gt;The kernel has a &lt;a href=&quot;https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/submitting-patches.rst?id=bc7938deaca7f474918c41a0372a410049bd4e13#n664&quot;&gt;restrictive line-length
limit&lt;/a&gt;.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;the &lt;code&gt;summary&lt;/code&gt; must be no more than 70-75 characters.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;No text gets wrapped or truncated, and everything is nice and in &lt;em&gt;good&lt;/em&gt; style.&lt;/p&gt;
&lt;p&gt;It is true that there is no strong evidence about what the &amp;quot;ideal&amp;quot; line-length
&lt;strong&gt;for coding&lt;/strong&gt; is. But we know that number for &lt;strong&gt;human-readable text&lt;/strong&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Research has led to recommendations that line length should not exceed about
70 characters per line. The reason behind this finding is that both very
short and very long lines slow down reading by interrupting the normal
pattern of eye movements and movements throughout the text.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;a href=&quot;https://www.researchgate.net/publication/234578707_Optimal_Line_Length_in_Reading--A_Literature_Review&quot;&gt;source&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Throw no stones! We are talking about &lt;strong&gt;human-readable&lt;/strong&gt;, i.e., text like
books, magazines, papers, &lt;strong&gt;and git logs!!!&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;As Linus explained in the quote at the top of this section, the &lt;strong&gt;writer&lt;/strong&gt;
is responsible for wrapping the text because the &lt;em&gt;pager&lt;/em&gt; tool might not be able
to do it for the &lt;strong&gt;reader&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;The default pager (&lt;code&gt;less&lt;/code&gt;) is not the only tool that truncates instead of
wrapping. Even in 2024 we haven&#x27;t found the magic solution for perfect
text-wrapping yet.&lt;/p&gt;
&lt;p&gt;Even GitHub truncates long commit titles to 72 characters. Be mindful of that
when committing long titles and messages.&lt;/p&gt;
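&lt;p&gt;Since the wrapping has to happen at writing time, a small script can take
care of it. The following is only a sketch using Python&#x27;s &lt;code&gt;textwrap&lt;/code&gt;
module: &lt;code&gt;wrap_commit_body&lt;/code&gt; is a made-up helper, the 72-column width
comes from the linked posts, and treating indented paragraphs as quoted text
that must stay untouched is a simplifying assumption of mine.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;import textwrap


def wrap_commit_body(body, width=72):
    &quot;&quot;&quot;Wrap prose paragraphs at `width` columns.

    Hypothetical helper. Paragraphs that start with whitespace are
    assumed to be quoted output (logs, compiler errors) and are kept
    exactly as written.
    &quot;&quot;&quot;
    wrapped = []
    for paragraph in body.split(&quot;\n\n&quot;):
        if paragraph.startswith((&quot; &quot;, &quot;\t&quot;)):
            wrapped.append(paragraph)
        else:
            wrapped.append(textwrap.fill(paragraph, width=width))
    return &quot;\n\n&quot;.join(wrapped)


message = (
    &quot;Compress profile pictures before storing them because uploads &quot;
    &quot;from modern phones are large enough to slow down the profile page.&quot;
    &quot;\n\n&quot;
    &quot;    error: this quoted line is longer than seventy-two characters &quot;
    &quot;and must not be re-wrapped&quot;
)
print(wrap_commit_body(message))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The prose paragraph comes out wrapped at 72 columns, while the indented
&amp;quot;error&amp;quot; line is left alone. This is the distinction only the writer can make.&lt;/p&gt;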
&lt;h2&gt;Do not assume the reviewer has access to an external website&lt;/h2&gt;
&lt;p&gt;I worked for a company that used to use GitHub for issue tracking as well as
repository hosting. Many of our commit messages merely pointed at GitHub links
and had no description at all.&lt;/p&gt;
&lt;p&gt;It was sad to see the company getting acquired and the parent company moving to
BitBucket while deleting the old GitHub account.&lt;/p&gt;
&lt;p&gt;It is OK to link to GitHub, Jira, Asana, etc., but the most important thing is
to &lt;strong&gt;make sure&lt;/strong&gt; the commit message has everything you need in it so that you
don&#x27;t depend on external services.&lt;/p&gt;
&lt;h2&gt;Outro&lt;/h2&gt;
&lt;p&gt;Linus on Tim Pope&#x27;s post:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;linus_on_tpope.png&quot; alt=&quot;linus_on_tpope.png&quot;&gt;&lt;/p&gt;
&lt;h2&gt;Further Reading&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/torvalds/linux/pull/17&quot;&gt;https://github.com/torvalds/linux/pull/17&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;script async src=&quot;https://scripts.simpleanalyticscdn.com/latest.js&quot;&gt;&lt;/script&gt;</content><published>2024-10-05T00:00:00Z</published><updated>2024-10-05T00:00:00Z</updated></entry><entry><title>Postgres Unique Constraints Without Downtime</title><link href="https://www.marcelofern.com/posts/postgres/unique-constraints-without-downtime/index.html"/><id>tag:marcelofern.com,2024-10-01:/postgres/unique-constraints-without-downtime/index.html</id><content type="html">&lt;h1&gt;Postgres Unique Constraints Without Downtime&lt;/h1&gt;
&lt;pre&gt;&lt;code&gt;Created at: 2024-10-01
Updated at: 2024-10-08
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The syntax for adding a unique constraint in Postgres is as follows:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;ALTER TABLE &amp;quot;table_name&amp;quot;
ADD CONSTRAINT &amp;quot;unique_constraint_on_foo&amp;quot;
UNIQUE (&amp;quot;foo&amp;quot;);
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This constraint will prevent multiple rows having the same value stored in
the &lt;code&gt;foo&lt;/code&gt; column.&lt;/p&gt;
&lt;p&gt;However, this operation acquires an &lt;code&gt;ACCESS EXCLUSIVE&lt;/code&gt; lock, blocking all reads
and writes to the table until it&#x27;s finished.&lt;/p&gt;
&lt;p&gt;If you are adding a unique constraint to a large table, the amount of time
spent to create the constraint might be prohibitive.&lt;/p&gt;
&lt;h2&gt;How To Safely Add a Unique Constraint Without Downtime&lt;/h2&gt;
&lt;p&gt;From the Postgres documentation:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;PostgreSQL automatically creates a unique index when a unique constraint or
primary key is defined for a table.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Creating this index behind the scenes while holding an &lt;code&gt;ACCESS EXCLUSIVE&lt;/code&gt; lock is
the problem we are trying to avoid.&lt;/p&gt;
&lt;p&gt;What we want to do is &lt;strong&gt;create the index first&lt;/strong&gt;, and &lt;strong&gt;CONCURRENTLY&lt;/strong&gt;, so that
when we add the constraint to the table, the table can use the already existing
index. This will make the subsequent &lt;code&gt;ALTER TABLE&lt;/code&gt; much faster to run.&lt;/p&gt;
&lt;p&gt;If you have the following table:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;CREATE TABLE example_table (
    id SERIAL PRIMARY KEY,
    int_field INT
);
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You can create a unique index concurrently (this won&#x27;t block any reads or
writes on this table), with the following command:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;SET lock_timeout TO &#x27;0&#x27;;

CREATE UNIQUE INDEX CONCURRENTLY IF NOT EXISTS unique_int_field_idx
ON example_table (int_field);
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Side Note 1: If you are using any value of &lt;code&gt;lock_timeout&lt;/code&gt; that is not zero, you
have to set it to zero before you create the index. This will prevent leaving
an invalid index behind if the operation fails due to a time out.&lt;/p&gt;
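&lt;p&gt;If you suspect that a failed attempt left an invalid index behind, you can
look for it in the &lt;code&gt;pg_index&lt;/code&gt; catalog, where invalid indexes have
&lt;code&gt;indisvalid&lt;/code&gt; set to false. A minimal sketch, assuming a DB-API cursor
such as the one psycopg2 provides; &lt;code&gt;find_invalid_indexes&lt;/code&gt; is a
made-up helper name:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;INVALID_INDEXES_SQL = &quot;&quot;&quot;
    SELECT indexrelid::regclass::text
    FROM pg_index
    WHERE NOT indisvalid;
&quot;&quot;&quot;


def find_invalid_indexes(cursor):
    &quot;&quot;&quot;Return the names of invalid indexes in the current database.

    Hypothetical helper: `cursor` is any DB-API cursor connected to
    Postgres, for example one created by psycopg2.
    &quot;&quot;&quot;
    cursor.execute(INVALID_INDEXES_SQL)
    return [row[0] for row in cursor.fetchall()]
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This matters because &lt;code&gt;IF NOT EXISTS&lt;/code&gt; only checks the name: if an
invalid index called &lt;code&gt;unique_int_field_idx&lt;/code&gt; is still around, the
&lt;code&gt;CREATE INDEX&lt;/code&gt; is skipped. Drop the invalid index first (for example
with &lt;code&gt;DROP INDEX CONCURRENTLY&lt;/code&gt;) and try again.&lt;/p&gt;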
&lt;p&gt;Side Note 2: You cannot use a partial index here. Postgres allows the creation
of partial unique indexes, but it does not allow the creation of partial unique
constraints. The documentation states
&lt;a href=&quot;https://web.archive.org/web/20240928225017/https://www.postgresql.org/docs/current/ddl-constraints.html&quot;&gt;source&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;A uniqueness restriction covering only some rows cannot be written as a
unique constraint, but it is possible to enforce such a restriction by
creating a unique partial index.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;If you try to use a partial index to create a unique constraint, Postgres will
raise the following error:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;ERROR:  &amp;quot;unique_int_field_idx&amp;quot; is a partial index
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Therefore, if you need a partial unique restriction, just keep your index. It
will be enough.&lt;/p&gt;
&lt;p&gt;Once this command has finished, you can add the new constraint USING the index
above:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;SET lock_timeout TO &#x27;10s&#x27;;

ALTER TABLE example_table
ADD CONSTRAINT unique_int_field UNIQUE USING INDEX unique_int_field_idx;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The operation above takes virtually no time.&lt;/p&gt;
&lt;p&gt;Note: I have reset &lt;code&gt;lock_timeout&lt;/code&gt; to a reasonable value (10s). This is a
safeguard. If there is a long-running transaction that would block the ALTER
TABLE statement, which in turn would block all reads and writes, the statement
will time out instead of causing a potential outage.&lt;/p&gt;
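&lt;p&gt;Putting the two steps together, the whole recipe boils down to four
statements. The sketch below only builds them as strings so that the order is
easy to see. &lt;code&gt;unique_constraint_steps&lt;/code&gt; and its naming scheme are made
up for illustration, and identifiers are interpolated without quoting, so only
pass trusted names.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;def unique_constraint_steps(table, column, lock_timeout=&quot;10s&quot;):
    &quot;&quot;&quot;Return the statements for the two-step approach, in order.

    Hypothetical helper. Identifiers are interpolated without quoting,
    so only pass trusted names.
    &quot;&quot;&quot;
    index = f&quot;unique_{column}_idx&quot;
    constraint = f&quot;unique_{column}&quot;
    return [
        # Step 1: build the index without blocking reads or writes.
        &quot;SET lock_timeout TO &#x27;0&#x27;;&quot;,
        f&quot;CREATE UNIQUE INDEX CONCURRENTLY IF NOT EXISTS {index} &quot;
        f&quot;ON {table} ({column});&quot;,
        # Step 2: promote the index to a constraint, failing fast if blocked.
        f&quot;SET lock_timeout TO &#x27;{lock_timeout}&#x27;;&quot;,
        f&quot;ALTER TABLE {table} &quot;
        f&quot;ADD CONSTRAINT {constraint} UNIQUE USING INDEX {index};&quot;,
    ]


for statement in unique_constraint_steps(&quot;example_table&quot;, &quot;int_field&quot;):
    print(statement)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Keep in mind that &lt;code&gt;CREATE INDEX CONCURRENTLY&lt;/code&gt; cannot run inside a
transaction block, so these statements have to be executed one by one on a
connection in autocommit mode.&lt;/p&gt;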
&lt;h2&gt;Why not just use the index for constraint validation?&lt;/h2&gt;
&lt;p&gt;Both the unique index and constraint raise the same error when an insert
attempt fails: &amp;quot;duplicate key value violates unique constraint.&amp;quot;&lt;/p&gt;
&lt;p&gt;So why would one even bother creating the constraint if the index suffices?&lt;/p&gt;
&lt;p&gt;From the old (v9.4) Postgres documentation:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Note: The preferred way to add a unique constraint to a table is ALTER TABLE
... ADD CONSTRAINT. The use of indexes to enforce unique constraints could be
considered an implementation detail that should not be accessed directly. One
should, however, be aware that there&#x27;s no need to manually create indexes on
unique columns; doing so would just duplicate the automatically-created
index.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;a href=&quot;https://www.postgresql.org/docs/9.4/indexes-unique.html&quot;&gt;source&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;This note has been removed from the Postgres documentation since version 9.5.
The commit that removed the note (&lt;a href=&quot;https://github.com/postgres/postgres/commit/049a7799dfc&quot;&gt;049a7799dfc&lt;/a&gt;) says:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;docs: remove outdated note about unique indexes&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;There is no guidance on why that was outdated and how unique indexes should be
interpreted.&lt;/p&gt;
&lt;p&gt;The differences remaining are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Constraints can be deferred.&lt;/li&gt;
&lt;li&gt;Indexes can be partial, which is useful if uniqueness is restricted to a
subset of data. You &lt;strong&gt;cannot&lt;/strong&gt; add a table constraint from a partial index.
Make sure your index wasn&#x27;t created with &amp;quot;WHERE ...&amp;quot;.&lt;/li&gt;
&lt;li&gt;If you care about the SQL standard, constraints are part of it, whereas
indexes aren&#x27;t (they&#x27;re an implementation detail).&lt;/li&gt;
&lt;li&gt;External tools that expect uniqueness to be defined through constraints
might not work properly if the constraint isn&#x27;t defined on
the schema.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Timing Different Approaches&lt;/h2&gt;
&lt;p&gt;The Python script below times how long it takes to add a constraint using two
approaches:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;ALTER TABLE &lt;strong&gt;without&lt;/strong&gt; a pre-existing index.&lt;/li&gt;
&lt;li&gt;ALTER TABLE &lt;strong&gt;with&lt;/strong&gt; a pre-existing index.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Note: These results were taken from a &lt;strong&gt;local&lt;/strong&gt; database without any
concurrency.&lt;/p&gt;
&lt;p&gt;TLDR: Creating an index concurrently first, and then using it to create the
constraint takes a little longer in total, but is a much safer approach.&lt;/p&gt;
&lt;p&gt;First the results, and then the script:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;results.png&quot; alt=&quot;results&quot;&gt;&lt;/p&gt;
&lt;p&gt;CSV:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;rows,unique constraint without index,unique index,unique constraint using index
1000000, 0.18, 0.23, 0.0
2000000, 0.38, 0.51, 0.0
3000000, 0.57, 0.8, 0.0
4000000, 0.74, 1.06, 0.0
5000000, 0.9, 1.3, 0.0
6000000, 1.11, 1.55, 0.0
7000000, 1.25, 1.84, 0.01
8000000, 1.58, 2.14, 0.0
9000000, 1.61, 2.4, 0.0
10000000, 1.78, 2.52, 0.0
20000000, 3.72, 4.97, 0.0
30000000, 5.52, 8.07, 0.0
40000000, 7.93, 10.77, 0.0
50000000, 10.16, 13.91, 0.0
60000000, 12.17, 17.24, 0.0
70000000, 15.74, 22.68, 0.01
80000000, 25.58, 38.55, 0.0
90000000, 40.45, 58.64, 0.0
100000000, 52.37, 60.59, 0.0
200000000, 124.62, 168.0, 0.0
300000000, 198.78, 293.85, 0.0
400000000, 250.16, 340.47, 0.01
500000000, 323.56, 435.08, 0.0
600000000, 395.54, 541.21, 0.0
700000000, 491.53, 724.37, 0.0
800000000, 589.03, 782.12, 0.0
900000000, 635.95, 909.58, 0.0
1000000000, 734.21, 1059.35, 0.01
&lt;/code&gt;&lt;/pre&gt;
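&lt;p&gt;To put &amp;quot;a little longer&amp;quot; into numbers, the snippet below computes how the
total time of the two-step approach compares to the direct one. It is a small
sketch that uses three rows copied from the CSV above; &lt;code&gt;overhead_ratio&lt;/code&gt;
is a made-up helper.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;# Three rows copied from the CSV above:
# (rows, direct constraint, unique index, constraint using index)
samples = [
    (1_000_000, 0.18, 0.23, 0.0),
    (100_000_000, 52.37, 60.59, 0.0),
    (1_000_000_000, 734.21, 1059.35, 0.01),
]


def overhead_ratio(direct, index, constraint):
    &quot;&quot;&quot;How many times longer the two-step approach takes in total.&quot;&quot;&quot;
    return (index + constraint) / direct


for rows, direct, index, constraint in samples:
    ratio = overhead_ratio(direct, index, constraint)
    print(f&quot;{rows:,} rows: {ratio:.2f}x the direct approach&quot;)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;For these three rows the two-step approach takes somewhere between roughly
1.15 and 1.45 times as long as the direct one. Virtually all of that time is
spent building the index, which does not block reads or writes.&lt;/p&gt;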
&lt;h2&gt;The script&lt;/h2&gt;
&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;import psycopg2
import time
import os


def get_cursor_and_connection():
    # Update connection details as per your PostgreSQL setup
    conn = psycopg2.connect(
        dbname=&amp;quot;test_db&amp;quot;,
        user=&amp;quot;postgres&amp;quot;,
        password=&amp;quot;postgres&amp;quot;,
        host=&amp;quot;localhost&amp;quot;,
        port=&amp;quot;5441&amp;quot;,
    )
    conn.autocommit = True
    return conn.cursor(), conn


def vacuum_table(cursor):
    &amp;quot;&amp;quot;&amp;quot;
    Vacuum is necessary to optimise the table
    structure before we perform the benchmark.

    It ensures that the performance tests are
    not affected by any leftover internal
    inconsistencies or unnecessary disk overhead
    from unvacuumed data.

    It also prevents autovacuum&#x27;ing from interfering
    on test results.

    The ANALYZE part is there to better inform
    Postgres on how to find the best plan
    for the ALTER TABLE / index.
    &amp;quot;&amp;quot;&amp;quot;
    print(&amp;quot;Vacuuming example_table...&amp;quot;)
    cursor.execute(&amp;quot;VACUUM ANALYZE example_table;&amp;quot;)


def create_table(cursor):
    print(&amp;quot;Creating table...&amp;quot;)
    cursor.execute(&amp;quot;&amp;quot;&amp;quot;
        DROP TABLE IF EXISTS example_table;
        CREATE TABLE example_table (
            id SERIAL PRIMARY KEY,
            int_field INT
        );
    &amp;quot;&amp;quot;&amp;quot;)


def insert_data(cursor, num_rows):
    print(f&amp;quot;Inserting {num_rows} unique rows...&amp;quot;)
    cursor.execute(f&amp;quot;&amp;quot;&amp;quot;
        INSERT INTO example_table (int_field)
        SELECT s
        FROM (
            SELECT generate_series(1, {num_rows}) AS s
            ORDER BY RANDOM()
        ) AS shuffled;
    &amp;quot;&amp;quot;&amp;quot;)


def clear_cache():
    print(&amp;quot;Clearing cache by restarting docker container...&amp;quot;)
    os.system(&amp;quot;docker restart postgres15&amp;quot;)
    print(&amp;quot;sleeping for 10s&amp;quot;)
    time.sleep(10)


def add_unique_constraint(cursor, conn):
    print(&amp;quot;Adding unique constraint directly...&amp;quot;)
    start_time = time.time()
    cursor.execute(&amp;quot;&amp;quot;&amp;quot;
        ALTER TABLE example_table
        ADD CONSTRAINT unique_int_field UNIQUE (int_field);
    &amp;quot;&amp;quot;&amp;quot;)
    conn.commit()
    duration = time.time() - start_time
    print(f&amp;quot;Time taken to add unique constraint: {duration:.2f} seconds&amp;quot;)
    return duration


def add_unique_constraint_with_index_first(cursor, conn):
    print(&amp;quot;Creating unique index...&amp;quot;)
    idx_start_time = time.time()
    cursor.execute(&amp;quot;&amp;quot;&amp;quot;
        CREATE UNIQUE INDEX CONCURRENTLY IF NOT EXISTS unique_int_field_idx
        ON example_table (int_field);
    &amp;quot;&amp;quot;&amp;quot;)
    conn.commit()
    idx_duration = time.time() - idx_start_time
    print(f&amp;quot;Time taken to add index: {idx_duration:.2f} seconds&amp;quot;)

    print(&amp;quot;Adding unique constraint using index...&amp;quot;)
    start_time = time.time()
    cursor.execute(&amp;quot;&amp;quot;&amp;quot;
        ALTER TABLE example_table
        ADD CONSTRAINT unique_int_field UNIQUE USING INDEX unique_int_field_idx;
    &amp;quot;&amp;quot;&amp;quot;)
    conn.commit()
    constraint_duration = time.time() - start_time
    print(
        f&amp;quot;Time taken to add unique constraint with index first: {constraint_duration:.2f} seconds&amp;quot;
    )
    return idx_duration, constraint_duration


def run_tests(table_sizes):
    results = {}

    for num_rows in table_sizes:
        print(f&amp;quot;\nRunning tests with {num_rows} rows...&amp;quot;)

        # Get a cursor to run queries.
        cursor, conn = get_cursor_and_connection()

        # Create the table and insert data
        create_table(cursor)
        insert_data(cursor, num_rows)
        conn.commit()

        # Vacuum the table after inserting rows
        vacuum_table(cursor)

        # Clear OS cache and get a new cursor
        clear_cache()
        cursor, conn = get_cursor_and_connection()

        # Test 1: Add unique constraint directly
        time_direct = add_unique_constraint(cursor, conn)

        # Clear OS cache again and get a new cursor
        clear_cache()
        cursor, conn = get_cursor_and_connection()

        # Test 2: Create index first, then add unique constraint
        create_table(cursor)  # Drop and recreate the table
        insert_data(cursor, num_rows)
        conn.commit()
        vacuum_table(cursor)

        # Clear OS cache and get a new cursor
        clear_cache()
        cursor, conn = get_cursor_and_connection()

        idx_duration, constraint_duration = add_unique_constraint_with_index_first(
            cursor, conn
        )

        results[num_rows] = {
            &amp;quot;direct_constraint&amp;quot;: time_direct,
            &amp;quot;index_then_constraint&amp;quot;: {
                &amp;quot;idx_duration&amp;quot;: idx_duration,
                &amp;quot;constraint_duration&amp;quot;: constraint_duration,
            },
        }

    cursor.close()
    conn.close()

    # Print final results
    print(&amp;quot;\nTest Results:&amp;quot;)
    for num_rows, times in results.items():
        print(f&amp;quot;Rows: {num_rows}&amp;quot;)
        print(f&amp;quot; - Direct constraint: {times[&#x27;direct_constraint&#x27;]:.2f} seconds&amp;quot;)
        print(
            f&amp;quot; - Index then constraint: \n&amp;quot;
            f&amp;quot;   - {times[&#x27;index_then_constraint&#x27;][&#x27;idx_duration&#x27;]:.2f} seconds (idx)\n&amp;quot;
            f&amp;quot;   - {times[&#x27;index_then_constraint&#x27;][&#x27;constraint_duration&#x27;]:.2f} seconds (constraint)&amp;quot;
        )


table_sizes = [
    1_000_000,
    2_000_000,
    3_000_000,
    4_000_000,
    5_000_000,
    6_000_000,
    7_000_000,
    8_000_000,
    9_000_000,
    10_000_000,
    20_000_000,
    30_000_000,
    40_000_000,
    50_000_000,
    60_000_000,
    70_000_000,
    80_000_000,
    90_000_000,
    100_000_000,
    200_000_000,
    300_000_000,
    400_000_000,
    500_000_000,
    600_000_000,
    700_000_000,
    800_000_000,
    900_000_000,
    1_000_000_000,
]
run_tests(table_sizes)
&lt;/code&gt;&lt;/pre&gt;
&lt;script async src=&quot;https://scripts.simpleanalyticscdn.com/latest.js&quot;&gt;&lt;/script&gt;</content><published>2024-10-01T00:00:00Z</published><updated>2024-10-08T00:00:00Z</updated></entry><entry><title>Why Is This Site Built With C</title><link href="https://www.marcelofern.com/posts/c/why-is-this-site-built-with-c/index.html"/><id>tag:marcelofern.com,2024-08-26:/c/why-is-this-site-built-with-c/index.html</id><content type="html">&lt;h1&gt;Why Is This Site Built With C&lt;/h1&gt;
&lt;pre&gt;&lt;code&gt;Created at: 2024-08-26
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I&#x27;ve been writing about &lt;strong&gt;things&lt;/strong&gt; on a personal website since
&lt;a href=&quot;http://web.archive.org/web/20171124021420/http://marcelonet.com/snippets/&quot;&gt;2017&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Most of what I have written falls into the category of notes-to-self, mostly
on how to do A or B.&lt;/p&gt;
&lt;p&gt;Only recently have I started polishing notes and turning them into posts on
specific topics.&lt;/p&gt;
&lt;p&gt;One thing I realised was that what kept me from writing more frequently wasn&#x27;t
a lack of ideas (or motivation), but the trouble of having to deal with
the website builder and platform I was using at the time.&lt;/p&gt;
&lt;p&gt;GitHub Pages didn&#x27;t exist at the time and the canonical way was to have an
Apache server running the website in some web provider. I didn&#x27;t know anything
about Apache and the little I saw didn&#x27;t interest me, so I looked for an
alternative.&lt;/p&gt;
&lt;p&gt;I built my first website with Django (served by Nginx) on a server hosted on
Digital Ocean. This was before the Droplets era, so I had to rent an Ubuntu
machine which cost $5.00 USD per month. That was a bit steep for a dev on a
Brazilian salary considering I had to pay for other services too (registrar,
email, etc).&lt;/p&gt;
&lt;p&gt;I was highly motivated to post things as I was still fresh in the web
development world and wanted to know how everything worked. I also had no idea
what I was doing and wanted my own website to be a sandbox where I could try
new things out.&lt;/p&gt;
&lt;p&gt;That was my first mistake. Building a &amp;quot;static&amp;quot; website with Django is too
cumbersome. You have to set up views, templates, run the server, get GitHub
hooks for resetting the remote server in Digital Ocean when new commits are
pushed, etc.&lt;/p&gt;
&lt;p&gt;Once the romantic view of a newbie blog-poster faded away, handling the whole
apparatus to publish a note took more time than writing the note itself.&lt;/p&gt;
&lt;p&gt;At some point I had to make a switch before the website grew too big.&lt;/p&gt;
&lt;p&gt;My second take was to ditch the whole website and start from scratch using
a static website generator. I decided to use Nuxt because I was using Vue at
work and the whole setup looked simple to start with.&lt;/p&gt;
&lt;p&gt;It was nice in the beginning. I set it up with GitHub Pages. I only had to
push the static site that Nuxt creates via a CLI command to my git repo
and GitHub handled the rest for me. That was a major improvement over the
previous infrastructure. On top of that, I could do cool dynamic things with
embedded JavaScript and have the framework interact with it.&lt;/p&gt;
&lt;p&gt;But I only had one blog post where I needed fancy JavaScript tooling. Soon it
became painful to maintain the website again. Publishing posts involved writing
things in Vue and that was just not an ergonomic way to write regular prose.&lt;/p&gt;
&lt;p&gt;Also the framework was a new technology, and maintainers were pushing updates
that broke backwards compatibility. Handling versioning of Vue and Nuxt along
with all their JavaScript dependencies was a big pain point and I had to give
up at some point.&lt;/p&gt;
&lt;h2&gt;Now&lt;/h2&gt;
&lt;p&gt;Learning from these two past mistakes, I came up with a set of requirements for
my next (and hopefully final) website:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Starting a post must be as easy as typing into a blank file.&lt;/li&gt;
&lt;li&gt;The website must be statically generated. And fast.&lt;/li&gt;
&lt;li&gt;There should be few to no dependencies for generating the website.&lt;/li&gt;
&lt;li&gt;It needs to last for at least the next 10 years.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The first requirement is satisfied by writing using markdown files. Writing
this blog post in Neovim looks like this:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;markdown.png&quot; alt=&quot;markdown&quot;&gt;&lt;/p&gt;
&lt;p&gt;The second requirement is a bit trickier, but it is directly related to the
third.&lt;/p&gt;
&lt;p&gt;Writing posts in markdown means that there needs to be a parser to convert the
files to HTML. I could either code this parser myself and have a beautiful
static site generator with zero dependencies, or allow myself a single
dependency.&lt;/p&gt;
&lt;p&gt;The problem is that the move from zero dependencies to one dependency is huge.
It feels way bigger than going from 10 dependencies to 100 dependencies.&lt;/p&gt;
&lt;p&gt;At the same time, writing a markdown parser isn&#x27;t the most trivial
enterprise, and the parser was the only dependency I needed to
have. I managed to convince myself that a dependency was okay and then I moved
on.&lt;/p&gt;
&lt;p&gt;My first instinct was to reach for &lt;a href=&quot;https://pandoc.org/&quot;&gt;Pandoc&lt;/a&gt;. I did so
and implemented a small shell script that could read my directory tree of
markdown files and convert them to HTML.&lt;/p&gt;
&lt;p&gt;That worked fine for about 20 to 30 markdown files. After that, the process of
converting files to HTML became noticeably slower. Pandoc is written in
Haskell, and it is not known for being fast at parsing large volumes of files.&lt;/p&gt;
&lt;p&gt;An alternative for saving time with recompilation was to update my script so
that only new markdown files or changed ones are marked for recompilation. That
would involve too much wizardry if I wanted to make the script nice and robust.
I didn&#x27;t want to do that. I didn&#x27;t want my script to grow so much that I would
need to start adding test cases and coverage.&lt;/p&gt;
&lt;p&gt;I also knew that parsing hundreds or even thousands of small files should be
doable in single-digit seconds. The problem was that Pandoc slowed everything
down, so my second requirement was not met.&lt;/p&gt;
&lt;p&gt;Moreover, the whole Pandoc ecosystem requires &lt;strong&gt;a lot of&lt;/strong&gt; dependencies.
227 dependencies and over 400MB of installed size, to be exact:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;Packages (227) ghc-libs-9.2.8-1  haskell-aeson-2.1.2.1-47  haskell-aeson-pretty-0.8.10-7
               haskell-ansi-terminal-0.11.4-66  haskell-ansi-wl-pprint-0.6.9-418
               haskell-appar-0.1.8-14  haskell-asn1-encoding-0.9.6-230
               haskell-asn1-parse-0.9.5-230  haskell-asn1-types-0.3.4-209  haskell-assoc-1.0.2-266
               haskell-async-2.2.5-27  haskell-attoparsec-0.14.4-74
               haskell-attoparsec-aeson-2.1.0.0-31  haskell-attoparsec-iso8601-1.1.0.0-50
               haskell-auto-update-0.1.6-339  haskell-base-compat-0.12.2-2
               haskell-base-compat-batteries-0.12.2-83  haskell-base-orphans-0.8.8.2-13
               haskell-base-unicode-symbols-0.2.4.2-14  haskell-base16-bytestring-1.0.2.0-80
               haskell-base64-0.4.2.4-69  haskell-base64-bytestring-1.2.1.0-104
               haskell-basement-0.0.16-2  haskell-bifunctors-5.6-77  haskell-bitvec-1.1.3.0-94
               haskell-blaze-builder-0.4.2.3-2  haskell-blaze-html-0.9.1.2-226
               haskell-blaze-markup-0.8.3.0-10  haskell-boring-0.2.1-3
               haskell-bsb-http-chunked-0.0.0.4-383  haskell-byteorder-1.0.4-25
               haskell-call-stack-0.4.0-184  haskell-case-insensitive-1.2.1.0-203
               haskell-cassava-0.5.3.1-4  haskell-cereal-0.5.8.3-2  haskell-citeproc-0.8.1-105
               haskell-cmdargs-0.10.22-2  haskell-colour-2.3.6-210  haskell-commonmark-0.2.4.1-1
               haskell-commonmark-extensions-0.2.4-2  haskell-commonmark-pandoc-0.2.1.3-82
               haskell-comonad-5.0.8-261  haskell-conduit-1.3.5-53  haskell-conduit-extra-1.3.6-134
               haskell-constraints-0.13.4-50  haskell-contravariant-1.5.5-4  haskell-cookie-0.4.6-2
               haskell-crypton-0.34-11  haskell-crypton-connection-0.3.2-8
               haskell-crypton-x509-1.7.6-28  haskell-crypton-x509-store-1.6.9-28
               haskell-crypton-x509-system-1.6.7-28  haskell-crypton-x509-validation-1.6.12-28
               haskell-data-array-byte-0.1.0.1-55  haskell-data-default-0.7.1.1-306
               haskell-data-default-class-0.1.2.0-25
               haskell-data-default-instances-containers-0.0.1-37
               haskell-data-default-instances-dlist-0.0.1-319
               haskell-data-default-instances-old-locale-0.0.1-37  haskell-data-fix-0.3.2-102
               haskell-dec-0.0.5-5  haskell-digest-0.0.1.7-2  haskell-digits-0.3.1-21
               haskell-distributive-0.6.2.1-209  haskell-dlist-1.0-241
               haskell-doclayout-0.4.0.1-29  haskell-doctemplates-0.11-71
               haskell-easy-file-0.2.5-21  haskell-emojis-0.1.3-10  haskell-erf-2.0.0.0-25
               haskell-fast-logger-3.1.2-74  haskell-file-embed-0.0.15.0-2
               haskell-foldable1-classes-compat-0.1-77  haskell-generically-0.1.1-2
               haskell-ghc-bignum-orphans-0.1.1-2  haskell-glob-0.10.2-90
               haskell-gridtables-0.1.0.0-48  haskell-haddock-library-1.11.0-17
               haskell-hashable-1.4.3.0-46  haskell-hourglass-0.2.12-246  haskell-hslua-2.3.0-52
               haskell-hslua-aeson-2.3.0.1-34  haskell-hslua-classes-2.3.0-53
               haskell-hslua-core-2.3.1-45  haskell-hslua-list-1.1.1-60
               haskell-hslua-marshalling-2.3.1-5  haskell-hslua-module-doclayout-1.1.0-58
               haskell-hslua-module-path-1.1.0-53  haskell-hslua-module-system-1.1.0.1-27
               haskell-hslua-module-text-1.1.0.1-27  haskell-hslua-module-version-1.1.0-53
               haskell-hslua-module-zip-1.1.1-22  haskell-hslua-objectorientation-2.3.0-49
               haskell-hslua-packaging-2.3.1-14  haskell-hslua-repl-0.1.2-11
               haskell-hslua-typing-0.1.1-7  haskell-http-api-data-0.5.1-54
               haskell-http-client-0.7.15-23  haskell-http-client-tls-0.3.6.3-58
               haskell-http-date-0.0.11-136  haskell-http-media-0.8.1.1-14
               haskell-http-types-0.12.4-6  haskell-http2-4.1.0-22  haskell-hunit-1.6.2.0-227
               haskell-indexed-traversable-0.1.3-69
               haskell-indexed-traversable-instances-0.1.1.2-44
               haskell-integer-logarithms-1.0.3.1-7  haskell-iproute-1.7.12-82
               haskell-ipynb-0.2-139  haskell-isocline-1.0.9-2  haskell-jira-wiki-markup-1.5.1-22
               haskell-juicypixels-3.3.8-31  haskell-lexer-1.1.1-2  haskell-libyaml-0.1.4-5
               haskell-lpeg-1.0.4-26  haskell-lua-2.3.2-6  haskell-memory-0.18.0-8
               haskell-mime-types-0.1.2.0-2  haskell-mmorph-1.2.0-6
               haskell-monad-control-1.0.3.1-102  haskell-mono-traversable-1.0.17.0-8
               haskell-network-3.1.4.0-20  haskell-network-byte-order-0.1.7-2
               haskell-network-uri-2.6.4.2-31  haskell-old-locale-1.0.0.7-31
               haskell-old-time-1.1.0.4-2  haskell-onetuple-0.3.1-75  haskell-only-0.1-23
               haskell-optparse-applicative-0.17.1.0-29  haskell-ordered-containers-0.2.3-2
               haskell-pandoc-3.1.8-34  haskell-pandoc-lua-engine-0.2.1.2-23
               haskell-pandoc-lua-marshal-0.2.4-2  haskell-pandoc-server-0.1.0.5-39
               haskell-pandoc-types-1.23.1-21  haskell-pem-0.2.4-286  haskell-pretty-show-1.10-15
               haskell-prettyprinter-1.7.1-165  haskell-primitive-0.7.4.0-111
               haskell-psqueues-0.2.8.0-10  haskell-quickcheck-2.14.3-64  haskell-random-1.2.1.2-8
               haskell-recv-0.1.0-30  haskell-regex-base-0.94.0.2-3  haskell-regex-tdfa-1.3.2.2-44
               haskell-resourcet-1.2.6-51  haskell-safe-0.3.21-5
               haskell-safe-exceptions-0.1.7.4-21  haskell-scientific-0.3.7.0-113
               haskell-semialign-1.2.0.1-160  haskell-semigroupoids-5.3.7-142
               haskell-servant-0.20.1-12  haskell-servant-server-0.20-23  haskell-sha-1.6.4.4-20
               haskell-simple-sendfile-0.2.32-36  haskell-singleton-bool-0.1.7-3
               haskell-skylighting-0.14-15  haskell-skylighting-core-0.14-14
               haskell-skylighting-format-ansi-0.1-121
               haskell-skylighting-format-blaze-html-0.1.1.2-8
               haskell-skylighting-format-context-0.1.0.2-86
               haskell-skylighting-format-latex-0.1-121  haskell-socks-0.6.1-237
               haskell-some-1.0.5-2  haskell-sop-core-0.5.0.2-2  haskell-split-0.2.5-6
               haskell-splitmix-0.1.0.5-22  haskell-statevar-1.2.2-3
               haskell-streaming-commons-0.2.2.6-26  haskell-strict-0.4.0.1-240
               haskell-string-conversions-0.4.0.1-171  haskell-syb-0.7.2.4-8
               haskell-tagged-0.8.8-2  haskell-tagsoup-0.14.8-226  haskell-temporary-1.3-585
               haskell-texmath-0.12.8.4-15  haskell-text-conversions-0.3.1.1-63
               haskell-text-icu-0.8.0.5-2  haskell-text-short-0.1.5-79
               haskell-th-abstraction-0.4.5.0-2  haskell-th-compat-0.1.5-2  haskell-th-lift-0.8.4-2
               haskell-th-lift-instances-0.1.20-47  haskell-these-1.1.1.1-267
               haskell-time-compat-1.9.6.1-97  haskell-time-manager-0.0.1-35  haskell-tls-1.8.0-29
               haskell-toml-parser-1.3.1.3-18  haskell-transformers-base-0.4.6-102
               haskell-transformers-compat-0.7.2-2  haskell-type-equality-1.0.1-1
               haskell-typed-process-0.2.11.1-15  haskell-typst-0.3.2.1-32
               haskell-typst-symbols-0.1.4-2  haskell-unicode-collation-0.1.3.6-12
               haskell-unicode-data-0.4.0.1-33  haskell-unicode-transforms-0.4.0.1-74
               haskell-uniplate-1.6.13-223  haskell-unix-compat-0.7.1-16
               haskell-unix-time-0.4.13-1  haskell-unliftio-0.2.25.0-10
               haskell-unliftio-core-0.2.1.0-2  haskell-unordered-containers-0.2.20-18
               haskell-utf8-string-1.0.2-150  haskell-uuid-types-1.0.5.1-16
               haskell-vault-0.3.1.5-185  haskell-vector-0.13.1.0-31
               haskell-vector-algorithms-0.9.0.2-3  haskell-vector-stream-0.1.0.1-2
               haskell-wai-3.2.4-19  haskell-wai-app-static-3.1.9-14  haskell-wai-cors-0.2.7-355
               haskell-wai-extra-3.1.15-2  haskell-wai-logger-2.4.0-443  haskell-warp-3.3.30-59
               haskell-witherable-0.4.2-101  haskell-word8-0.1.3-23  haskell-xml-1.3.14-31
               haskell-xml-conduit-1.9.1.3-53  haskell-xml-types-0.3.8-9  haskell-yaml-0.11.11.2-49
               haskell-zip-archive-0.4.3.2-2  haskell-zlib-0.6.3.0-60  hslua-cli-1.4.1-49
               lua-lpeg-1.1.0-2  numactl-2.0.18-1  pandoc-cli-0.1.1.1-113

Total Download Size:    65.52 MiB
Total Installed Size:  473.35 MiB
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;There are too many dependencies for me to trust the environment will be stable
for a long time. The last thing I want to do is deal with backward incompatible
changes on my wee blog.&lt;/p&gt;
&lt;p&gt;I looked for a better alternative and found
&lt;a href=&quot;http://github.com/mity/md4c&quot;&gt;md4c&lt;/a&gt;, which is a parser written in C with no
dependencies other than the standard C library. It also has only one header
file and one source file, making it easy to embed it straight into any C
project.&lt;/p&gt;
&lt;p&gt;The only work I needed to do was to write a C script (which turned out to be
~250 LOC) to call md4c functions and parse my &lt;code&gt;md&lt;/code&gt; files, and then chuck those
converted files into the GitHub Pages repo.&lt;/p&gt;
&lt;p&gt;My website converter script, which is all in this &lt;a href=&quot;https://gist.github.com/marcelofern/896574e055a05d011449b00217600fe6&quot;&gt;250
LOC&lt;/a&gt;
source file (excluding md4c), is feature-complete and compiles with any compiler that
supports the C standard from 1999 onwards. There&#x27;s no platform-dependent code
and it&#x27;s portable to Windows, Linux, and macOS.&lt;/p&gt;
&lt;p&gt;It runs incredibly well. I have 87 markdown files at the moment and parsing all
those files at the same time from scratch can be done virtually
instantaneously:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;[~] time ./scripts/website/converter.bin

real    0m0.115s
user    0m0.087s
sys     0m0.091s
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This allows me to completely flush the whole repo away and create it from
scratch in almost no time. I do not have to worry about creating specific
logic to just re-parse files that have changed or anything fancy like that,
which reduces the burden of maintenance and makes my script smaller and easier
to reason about.&lt;/p&gt;
&lt;p&gt;This result was far more reasonable than the time Pandoc took to
parse a mere 87 markdown files, which was in the double digits
(of seconds).&lt;/p&gt;
&lt;h2&gt;Outro&lt;/h2&gt;
&lt;p&gt;One popular alternative these days for this problem is
&lt;a href=&quot;https://gohugo.io/&quot;&gt;Hugo&lt;/a&gt;. There is nothing inherently wrong with Hugo. It is
decently fast (written in Go) and it is easy to get going for a simple website.
It seems better than some alternatives like
&lt;a href=&quot;https://github.com/getpelican/pelican&quot;&gt;pelican&lt;/a&gt;, which is written in Python and
thus will likely be slower to parse md files.&lt;/p&gt;
&lt;p&gt;However, Hugo doesn&#x27;t particularly appeal to me because the framework seems
too big and opinionated for what I need:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Hugo takes data files, i18n bundles, configuration, templates for layouts,
static files, assets, and content written in Markdown, HTML, AsciiDoctor, or
Org-mode and renders a static website. Some notable features are multilingual
support, image processing, asset management, custom output formats, markdown
render hooks and shortcodes. Nested sections allow for different types of
content to be separated, e.g. for a website containing a blog and a podcast.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;a href=&quot;http://web.archive.org/web/20240808001552/https://en.wikipedia.org/wiki/Hugo_(software)&quot;&gt;source&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;There&#x27;s a lot in there, so much so that big websites that need a lot of
features (e.g. Smashing Magazine) are capable of relying heavily on Hugo.&lt;/p&gt;
&lt;p&gt;Also, as much as Hugo looks satisfactory today, I expect it to
keep growing and changing in ways that would force me to keep up with it
every now and then.&lt;/p&gt;
&lt;p&gt;I just need a parser that performs a one-off job of parsing the given markdown
file. There is no benefit in bringing a GC-based language into this type of
problem.&lt;/p&gt;
&lt;p&gt;I also wanted my website to use tech that I know will continue to work in the
upcoming decades (my last requirement). There is virtually nothing that beats C
compilers in that area as of today. For any new platform out there the first
thing that needs to happen is getting a C compiler built along with the
standard library (which is probably the only standard lib of popular
programming languages that fits in a &lt;a href=&quot;https://www.google.com/search?q=C+standard+library+P.J+Plauger&quot;&gt;commented book of 500
pages&lt;/a&gt;...).
Otherwise, nothing can run on the platform. So I&#x27;m hoping this bet will
pay off.&lt;/p&gt;
&lt;script async src=&quot;https://scripts.simpleanalyticscdn.com/latest.js&quot;&gt;&lt;/script&gt;</content><published>2024-08-26T00:00:00Z</published><updated>2024-08-26T00:00:00Z</updated></entry><entry><title>The Dumbest Compiler Imaginable</title><link href="https://www.marcelofern.com/posts/python/the-dumbest-compiler-imaginable/index.html"/><id>tag:marcelofern.com,2024-08-20:/python/the-dumbest-compiler-imaginable/index.html</id><content type="html">&lt;h1&gt;The Dumbest Compiler Imaginable&lt;/h1&gt;
&lt;pre&gt;&lt;code&gt;Created at: 2024-08-20
&lt;/code&gt;&lt;/pre&gt;
&lt;blockquote&gt;
&lt;p&gt;Python is about having the simplest, dumbest compiler imaginable, and the
official runtime semantics actively discourage cleverness in the compiler
like parallelizing loops or turning recursions into loops.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;− Guido van Rossum (creator of Python). &lt;a href=&quot;https://books.google.co.nz/books?id=bIxWAgAAQBAJ&amp;amp;pg=PA26&amp;amp;lpg=PA26&amp;amp;dq=%22Python+is+about+having+the+simplest,+dumbest+compiler+imaginable.%22&amp;amp;source=bl&amp;amp;ots=2OfDoWX321&amp;amp;sig=ACfU3U32jKZBE3VkJ0gvkKbxRRgD0bnoRg&amp;amp;hl=en&amp;amp;sa=X&amp;amp;redir_esc=y#v=onepage&amp;amp;q=%22Python%20is%20about%20having%20the%20simplest%2C%20dumbest%20compiler%20imaginable.%22&amp;amp;f=false&quot;&gt;source&lt;/a&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;This might be a shocking statement to read for someone used to languages that
compile to machine code directly like C, C++, Zig, Rust, etc.&lt;/p&gt;
&lt;p&gt;The compiler is usually a major source of optimisation for human-written
code, being capable of statically analysing code and removing unnecessary
computation.&lt;/p&gt;
&lt;p&gt;This basically allows developers to write whatever code they think is the most
readable and expressive, while handing over the work of optimising the actual
code to the compiler.&lt;/p&gt;
&lt;p&gt;A classic example in C is:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-c&quot;&gt;int foo() {
  int bar1 = 42; // Unused variable.
  int bar2 = 100;
  return bar2;
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Which when compiled with the most basic level of optimisation via &lt;code&gt;gcc -O1&lt;/code&gt;
produces the following assembly code:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-asm&quot;&gt;foo:
  mov  eax, 100
  ret
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The function just returns the value 100. There is no use of the stack to
store values, no load operation to fetch variables, and no other form of
allocations. The compiler is able to reason that the code returns 100 and
thus creates the machine code to do just that.&lt;/p&gt;
&lt;p&gt;Compare this with the Python function below:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;def foo():
    bar1 = 42
    bar2 = 100
    return bar2
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Which produces the following bytecode via &lt;code&gt;dis.dis(foo)&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;LOAD_CONST               1 (42)
STORE_FAST               0 (bar1)
LOAD_CONST               2 (100)
STORE_FAST               1 (bar2)
LOAD_FAST                1 (bar2)
RETURN_VALUE
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The &lt;code&gt;bar1&lt;/code&gt; variable hasn&#x27;t been discarded, even though it hasn&#x27;t been used by
the code. A full description of Python opcodes is provided in the official
documentation
&lt;a href=&quot;http://web.archive.org/web/20240820032050/https://docs.python.org/3/library/dis.html&quot;&gt;(source)&lt;/a&gt;,
but basically the instructions above do:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;LOAD_CONST&lt;/strong&gt;: Pushes the value 42 onto the stack. The big switch case in
CPython uses this underlying C code:&lt;pre&gt;&lt;code class=&quot;language-c&quot;&gt;        TARGET(LOAD_CONST) {
            frame-&amp;gt;instr_ptr = next_instr;
            next_instr += 1;
            INSTRUCTION_STATS(LOAD_CONST);
            _PyStackRef value;
            value = PyStackRef_FromPyObjectNew(GETITEM(FRAME_CO_CONSTS, oparg));
            stack_pointer[0] = value;
            stack_pointer += 1;
            assert(WITHIN_STACK_BOUNDS());
            DISPATCH();
        }
&lt;/code&gt;&lt;/pre&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;STORE_FAST&lt;/strong&gt;: Pops the last value of the stack (42) into the local variable
(bar1).&lt;pre&gt;&lt;code class=&quot;language-c&quot;&gt;      TARGET(STORE_FAST) {
          frame-&amp;gt;instr_ptr = next_instr;
          next_instr += 1;
          INSTRUCTION_STATS(STORE_FAST);
          _PyStackRef value;
          value = stack_pointer[-1];
          SETLOCAL(oparg, value);
          stack_pointer += -1;
          assert(WITHIN_STACK_BOUNDS());
          DISPATCH();
      }
&lt;/code&gt;&lt;/pre&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;LOAD_FAST&lt;/strong&gt;: Pushes a reference to bar2 onto the stack.&lt;pre&gt;&lt;code class=&quot;language-c&quot;&gt;      TARGET(LOAD_FAST) {
          frame-&amp;gt;instr_ptr = next_instr;
          next_instr += 1;
          INSTRUCTION_STATS(LOAD_FAST);
          _PyStackRef value;
          assert(!PyStackRef_IsNull(GETLOCAL(oparg)));
          value = PyStackRef_DUP(GETLOCAL(oparg));
          stack_pointer[0] = value;
          stack_pointer += 1;
          assert(WITHIN_STACK_BOUNDS());
          DISPATCH();
      }
&lt;/code&gt;&lt;/pre&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;RETURN_VALUE&lt;/strong&gt;: pops the stack, returning the value (bar2) back to the
caller. (C code is too long to add here).&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Note that those operations aren&#x27;t light on CPU time nor on stores/loads. But
that is Python, so there&#x27;s not much we can do there.&lt;/p&gt;
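&lt;p&gt;A minimal sketch for reproducing the listing above on your own interpreter,
assuming only the standard &lt;code&gt;dis&lt;/code&gt; module. The &lt;code&gt;opnames&lt;/code&gt; variable is
just an illustrative name, and the exact opcode names and their count vary
between CPython releases, so no output is shown here:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;import dis


def foo():
    bar1 = 42  # assigned but never read
    bar2 = 100
    return bar2


# Opcode names as seen by the interpreter running this script.
opnames = [ins.opname for ins in dis.get_instructions(foo)]
print(opnames)

# Local variable slots reserved for the function, unused ones included.
print(foo.__code__.co_varnames)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The second print is the interesting one: &lt;code&gt;bar1&lt;/code&gt; still gets its own slot
even though nothing ever reads it.&lt;/p&gt;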
&lt;p&gt;For comparison, this is what a function returning 100 would do:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;def foo():
    return 100
&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;RETURN_CONST             1 (100)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In theory, a bytecode optimiser could generate this shorter bytecode for our
original function once it determined that the function always returns the same constant
value. This type of optimisation is called &lt;a href=&quot;http://web.archive.org/web/20240820050751/https://en.wikipedia.org/wiki/Peephole_optimization&quot;&gt;Peephole
Optimisation&lt;/a&gt;,
a term coined back in 1965.&lt;/p&gt;
&lt;p&gt;Optimising the snippet above seems simple, but one immediate
problem is Python&#x27;s dynamically typed nature. It isn&#x27;t trivial to know whether an
operation has been overloaded or not, but if we could tell the compiler
&amp;quot;I haven&#x27;t overloaded anything for these classes&amp;quot; we could possibly have some
nice optimisations.&lt;/p&gt;
&lt;p&gt;An &lt;a href=&quot;https://legacy.python.org/workshops/1998-11/proceedings/papers/montanaro/montanaro.html&quot;&gt;interesting paper&lt;/a&gt; from 1998 has a lot to say about optimising Python bytecode.&lt;/p&gt;
&lt;p&gt;For example, one snippet provided in the paper above talks about a common
unpacking pattern:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;a,b,c = 1,2,3
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Which in current versions of Python generates the bytecode:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;LOAD_CONST               0 ((1, 2, 3))
UNPACK_SEQUENCE          3
STORE_NAME               0 (a)
STORE_NAME               1 (b)
STORE_NAME               2 (c)
RETURN_CONST             1 (None)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Which is less desirable than the code:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;a = 1
b = 2
c = 3
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Which skips the tuple allocation and unpacking overhead, dealing with variables
directly stored in the stack:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;LOAD_CONST               0 (1)
STORE_NAME               0 (a)
LOAD_CONST               1 (2)
STORE_NAME               1 (b)
LOAD_CONST               2 (3)
STORE_NAME               2 (c)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Interestingly, Python tuples are immutable, and thus can be loaded as
constants. That&#x27;s why the LOAD_CONST opcode managed to load the three values
in one chunk.&lt;/p&gt;
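&lt;p&gt;To check this on a given interpreter, here is a small sketch that prints the
constants stored on a function&#x27;s code object. The function name &lt;code&gt;unpack&lt;/code&gt; is made
up for illustration, and whether the tuple shows up as a single entry depends on
the CPython version, so no output is shown:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;def unpack():
    a, b, c = 1, 2, 3
    return a + b + c


# Constants the compiler stored on the code object. On interpreters that
# fold the right-hand side, the tuple (1, 2, 3) appears here as one entry.
print(unpack.__code__.co_consts)

# The function behaves the same either way.
print(unpack())
&lt;/code&gt;&lt;/pre&gt;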
&lt;p&gt;There are several other examples of optimisation opportunities, but the key
question is: &lt;em&gt;Why isn&#x27;t CPython already doing those&lt;/em&gt;?&lt;/p&gt;
&lt;p&gt;Guido&#x27;s quote provided at the top of this post isn&#x27;t sufficient to elucidate
what perks we are getting from having an unoptimised bytecode compiler. One
obvious one is maintainability. A dumb compiler is way easier to maintain and
change than a smart compiler that optimises a lot.&lt;/p&gt;
&lt;p&gt;I know that compiler experts will say that provided the right compiler
architecture, pattern-matching optimisation becomes easy as such optimisations
can be injected as &amp;quot;plugins&amp;quot;. Since I am not a compiler expert I have no
way to validate this idea. All I have is Guido&#x27;s quote.&lt;/p&gt;
&lt;p&gt;I am not particularly fond of having simplicity in the compiler at the expense
of all Python programs being slowed down because of it. But maybe if Python had
an optimiser for bytecode, Python wouldn&#x27;t have existed in the first place.&lt;/p&gt;
&lt;p&gt;It seems like the tides have been changing though as Python 3.13 will get a
&lt;a href=&quot;http://web.archive.org/web/20240718182110/https://tonybaloney.github.io/posts/python-gets-a-jit.html&quot;&gt;JIT&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Another perk from a dumb compiler is debugging. Written Python code directly
translates into bytecode, and thus we can do this:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;def foo():
    a = 10
    breakpoint()
    b = 20
    return b
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In the process of debugging we know that the variable &lt;code&gt;a&lt;/code&gt; is there and won&#x27;t
be optimised away. However... This is a bit of a straw man argument as we
could generate the bytecode with different levels of optimisation as in
&lt;code&gt;gcc&lt;/code&gt;&#x27;s -O1, -O2, and -O3, thus using a lower level of optimisation when
debugging.&lt;/p&gt;
&lt;hr&gt;
&lt;h2&gt;PyPy&lt;/h2&gt;
&lt;p&gt;PyPy only has a couple of special bytecodes on top of what CPython already has,
and PyPy in general doesn&#x27;t perform a lot of bytecode optimisations either.&lt;/p&gt;
&lt;p&gt;But there are two interesting opcodes that make a significant difference on a
program&#x27;s performance. For the following code:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-py&quot;&gt;class Foo:
    def bar(self, x: int, y: int) -&amp;gt; int:
        return x + y


foo = Foo()
x = 1
y = 2
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Disassembling this method call on the CPython interpreter:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;import dis

# Disassembles the bytecode for the call foo.bar(x, y).
dis.dis(lambda: foo.bar(x, y))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Gives me the bytecode:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;LOAD_GLOBAL              0 (foo)
LOAD_ATTR                3 (NULL|self + bar)
LOAD_GLOBAL              4 (x)
LOAD_GLOBAL              6 (y)
CALL                     2
RETURN_VALUE
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Whereas in PyPy we have:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;LOAD_GLOBAL              0 (foo)
LOAD_METHOD              1 (bar)
LOAD_GLOBAL              2 (x)
LOAD_GLOBAL              3 (y)
CALL_METHOD              2
RETURN_VALUE
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Both have 6 bytecode instructions, but the first difference is that PyPy
uses its special &lt;code&gt;LOAD_METHOD&lt;/code&gt; opcode instead of a &lt;code&gt;LOAD_ATTR&lt;/code&gt; instruction.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;LOAD_METHOD&lt;/code&gt; pushes two values to the stack instead of one. It pushes the
unbound Python function object (Foo.bar) and the object itself (foo).&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;CALL_METHOD&lt;/code&gt; (which received a parameter of N = 2) will pop the two
variables from the stack (x, y) as well as the &amp;quot;self&amp;quot; argument for the
unbound function (&amp;quot;foo&amp;quot;) and call &lt;code&gt;Foo.bar(self, x, y)&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;To understand why this optimises the underlying code one must understand
the difference between bound and unbound methods in Python.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;# Using the object.
print(foo.bar)
&amp;gt;&amp;gt; &amp;lt;bound method Foo.bar of &amp;lt;__main__.Foo object at 0x7646b4648bf0&amp;gt;&amp;gt;

# Using the class.
print(Foo.bar)
&amp;gt;&amp;gt; &amp;lt;function Foo.bar at 0x7646b46972e0&amp;gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In the first case, where the function &lt;code&gt;bar&lt;/code&gt; comes from an instantiated object, it is
called &amp;quot;bound&amp;quot;, because that function is linked to the object and thus has a
&amp;quot;self&amp;quot; variable bound to it.&lt;/p&gt;
&lt;p&gt;In the second case, where we use the &lt;code&gt;Foo&lt;/code&gt; class, the function &lt;code&gt;bar&lt;/code&gt; is not
bound as it didn&#x27;t come from an instantiated object and thus does not have a
&lt;code&gt;self&lt;/code&gt; object. Trying to call it without one will result in an error:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;&amp;gt;&amp;gt;&amp;gt; Foo.bar(x=1, y=2)
Traceback (most recent call last):
  File &amp;quot;&amp;lt;stdin&amp;gt;&amp;quot;, line 1, in &amp;lt;module&amp;gt;
TypeError: Foo.bar() missing 1 required positional argument: &#x27;self&#x27;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;However, you can do this:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;&amp;gt;&amp;gt;&amp;gt; Foo.bar(self=foo, x=1, y=2)
3
&lt;/code&gt;&lt;/pre&gt;
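&lt;p&gt;To make the relationship between the two forms concrete, here is a small
sketch. It assumes a definition of &lt;code&gt;Foo&lt;/code&gt; whose &lt;code&gt;bar&lt;/code&gt; adds its two arguments,
which is consistent with the result 3 above but is still an assumption.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;class Foo:
    # Hypothetical class body, chosen to match the result 3 shown above.
    def bar(self, x, y):
        return x + y


foo = Foo()
bound = foo.bar

# A bound method is a small wrapper around two things:
# the instance it was looked up on, and the plain function on the class.
same_instance = bound.__self__ is foo
same_function = bound.__func__ is Foo.bar

# Calling the bound method is equivalent to calling the
# plain function with the instance passed in explicitly.
via_bound = bound(1, 2)
via_function = Foo.bar(foo, 1, 2)
print(same_instance, same_function, via_bound, via_function)
&lt;/code&gt;&lt;/pre&gt;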
&lt;p&gt;So &lt;strong&gt;why is the original LOAD_ATTR&lt;/strong&gt; instruction a problem?&lt;/p&gt;
&lt;p&gt;The problem comes from the performance penalty imposed by creating bound
methods and calling them.&lt;/p&gt;
&lt;p&gt;CPython creates bound methods on demand, i.e. every time a method is looked
up on an instance, a new bound method object is initialised and allocated in
memory right there.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;&amp;gt;&amp;gt;&amp;gt; obj_1 = Foo()
&amp;gt;&amp;gt;&amp;gt; obj_2 = Foo()
&amp;gt;&amp;gt;&amp;gt;
&amp;gt;&amp;gt;&amp;gt; obj_1.bar is obj_2.bar
False
&amp;gt;&amp;gt;&amp;gt; obj_1.bar == obj_2.bar
False
&amp;gt;&amp;gt;&amp;gt; obj_1.bar
&amp;lt;bound method Foo.bar of &amp;lt;__main__.Foo object at 0x7646b49ebf80&amp;gt;&amp;gt;
&amp;gt;&amp;gt;&amp;gt; obj_2.bar
&amp;lt;bound method Foo.bar of &amp;lt;__main__.Foo object at 0x7646b49e8320&amp;gt;&amp;gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Note how &lt;code&gt;obj_1.bar&lt;/code&gt; and &lt;code&gt;obj_2.bar&lt;/code&gt; are different objects, each bound to
its own instance. CPython will create instances of those bound methods for each
object before it can call the bound &lt;code&gt;.bar&lt;/code&gt; function (allocation on demand).
PyPy, however, uses the stack to hold the unbound function and calls it with
the &amp;quot;self&amp;quot; object that is already stored on the stack, so that there is no
overhead of allocating and creating bound methods when a method needs to be
called. It operates similarly to &lt;code&gt;Foo.bar(self=obj, x=1, y=2)&lt;/code&gt;.&lt;/p&gt;
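&lt;p&gt;The allocation on demand can be observed from Python itself. The sketch below
again assumes a simple &lt;code&gt;Foo&lt;/code&gt; class, and the identity check describes CPython&#x27;s
behaviour; other interpreters may differ.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;class Foo:
    # Hypothetical class body used only for this sketch.
    def bar(self, x, y):
        return x + y


obj = Foo()

# Keep two lookups of the same method alive at the same time.
first = obj.bar
second = obj.bar

# On CPython each lookup builds a separate bound method object...
distinct_objects = first is not second
# ...although the two compare equal and wrap the very same function.
equal_methods = first == second
shared_function = first.__func__ is second.__func__

# Skipping the bound method entirely gives the same result.
direct_call = Foo.bar(obj, 1, 2)
print(distinct_objects, equal_methods, shared_function, direct_call)
&lt;/code&gt;&lt;/pre&gt;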
&lt;p&gt;This strategy provides a considerable performance improvement for heavily OOP
programs. According to the PyPy documentation:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Another optimization, or rather set of optimizations, that has a uniformly
good effect are the two ‘method optimizations’, i.e. the method cache and the
LOOKUP_METHOD and CALL_METHOD opcodes. On a heavily object-oriented benchmark
(richards) they combine to give a speed-up of nearly 50%, and even on the
extremely un-object-oriented pystone benchmark, the improvement is over 20%.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;a href=&quot;http://web.archive.org/web/20240222082516/https://doc.pypy.org/en/latest/interpreter-optimizations.html&quot;&gt;source&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;Outro&lt;/h2&gt;
&lt;p&gt;It is important to note that there have been many attempts to make Python
faster, many of which have failed [&lt;a href=&quot;https://peps.python.org/pep-3146/&quot;&gt;1&lt;/a&gt;]
[&lt;a href=&quot;https://github.com/pyston/pyston&quot;&gt;2&lt;/a&gt;].&lt;/p&gt;
&lt;p&gt;As much as it would be nice to have another Python interpreter fully JIT&#x27;ed and
full of bytecode optimisations, in reality it is really hard to compete against
CPython. Many 3rd-party libraries use CPython&#x27;s C-extensions directly, which
aren&#x27;t necessarily available in other Python interpreters (excluding some
forks), rendering such libraries unusable.&lt;/p&gt;
&lt;p&gt;It might be going too far to say that Python is a mono-implementation language, but
it does feel like it. If a fork is successful it may be merged upstream
instead of remaining a fork. If the interpreter itself is built without
CPython&#x27;s C-extensions in mind, it will not provide a rich ecosystem for all
the performance-dependent 3rd-party libs out there and will thus probably be
less used.&lt;/p&gt;
&lt;script async src=&quot;https://scripts.simpleanalyticscdn.com/latest.js&quot;&gt;&lt;/script&gt;</content><published>2024-08-20T00:00:00Z</published><updated>2024-08-20T00:00:00Z</updated></entry><entry><title>Managing Python Environments</title><link href="https://www.marcelofern.com/posts/python/managing-python-environments/index.html"/><id>tag:marcelofern.com,2024-08-06:/python/managing-python-environments/index.html</id><content type="html">&lt;h1&gt;Managing Python Environments&lt;/h1&gt;
&lt;pre&gt;&lt;code&gt;Created at: 2024-08-06
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The ecosystem for managing Python environments is huge, and so is the number
of tools in it.&lt;/p&gt;
&lt;p&gt;We have: &lt;code&gt;pyenv&lt;/code&gt;, &lt;code&gt;virtualenv&lt;/code&gt;, &lt;code&gt;virtualenvwrapper&lt;/code&gt;, &lt;code&gt;asdf&lt;/code&gt;, &lt;code&gt;conda&lt;/code&gt;,
&lt;code&gt;anaconda&lt;/code&gt;, &lt;code&gt;uv&lt;/code&gt;, &lt;code&gt;poetry&lt;/code&gt;, &lt;code&gt;pipenv&lt;/code&gt; etc.&lt;/p&gt;
&lt;p&gt;It is very easy to break your local environment if you are new to all of this.
This cartoon from xkcd sums it up well:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;python_xkcd.png&quot; alt=&quot;python_xkcd&quot;&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;http://web.archive.org/web/20240716100839/https://xkcd.com/1987/&quot;&gt;source&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;The purpose of this post is to examine whether we need any of these tools when
working on multiple Python projects that potentially have incompatible Python
versions and dependencies.&lt;/p&gt;
&lt;p&gt;Mind you that I program for work. Some of the knowledge and intentions behind
the way I do things have been learned through trial and error over time. What
is written here assumes that you either have a similar background or are
willing to find common ground on why some of those decisions are harder for
me than others.&lt;/p&gt;
&lt;p&gt;Also, I expect that you don&#x27;t need convincing that using your global Python
executable for everything is a bad idea, and that isolated virtual environments
for each project are the best solution to avoid dependency headaches.&lt;/p&gt;
&lt;h2&gt;What Do These Tools Provide?&lt;/h2&gt;
&lt;p&gt;Every tool I cited above is a little bit different, but I will pick &lt;code&gt;pyenv&lt;/code&gt; as
an example since it is one of the tools with the smallest footprint.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;pyenv&lt;/code&gt; lets you:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Install a new Python version: &lt;code&gt;pyenv install 3.10.4&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Have pyenv automatically pick a Python environment when you &lt;code&gt;cd&lt;/code&gt; into a
folder: &lt;code&gt;pyenv local &amp;lt;version&amp;gt;&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Use a plugin framework that lets you add &lt;code&gt;virtualenv&lt;/code&gt; among other tools.&lt;/li&gt;
&lt;li&gt;Auto-complete commands.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;This all sounds pretty neat, but is it worth installing this tool made of
101,263 lines of code and data files (as per v2.4.9) just so that you have
these commands plus a plugin framework?&lt;/p&gt;
&lt;p&gt;My answer is no. You are not going to need it and you are better off with
the default tools (more on that later).&lt;/p&gt;
&lt;p&gt;There are three main points that I consider undesired behaviour coming from
the abstraction provided by &lt;code&gt;pyenv&lt;/code&gt;:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Bash shims. A shim is merely a proxy. If you add the &lt;code&gt;pyenv&lt;/code&gt; collection of
shims at the beginning of your $PATH variable, as in:&lt;pre&gt;&lt;code&gt;$(pyenv root)/shims:/usr/local/bin:/usr/bin:/bin
&lt;/code&gt;&lt;/pre&gt;
these shims will intercept Python commands like &lt;code&gt;pip&lt;/code&gt; so that the correct
virtualised &lt;code&gt;pip&lt;/code&gt; version for the directory you previously &lt;code&gt;cd&lt;/code&gt;&#x27;ed into is
chosen. Effectively, these shim commands replace your Python environment
commands like &lt;code&gt;pip&lt;/code&gt; with the commands hardcoded by &lt;code&gt;pyenv&lt;/code&gt; that do the magic
for you. But besides the magic nature of those shims, they can be very slow
(&lt;a href=&quot;https://github.com/pyenv/pyenv/issues/2802&quot;&gt;example&lt;/a&gt;). If you have many
Python projects, these shims start to become a bit of dark magic and you
won&#x27;t have direct access to the Python tools if anything bad happens. Plus,
if pyenv adds an order of magnitude of slowness compared to running the
Python binary itself, pyenv becomes a painful tool to use.&lt;/li&gt;
&lt;li&gt;Python versions are hidden from you, which is another magical feature that
makes it a little less ergonomic for you to control or debug a particular
environment yourself. This problem is compounded when the bug is in
&lt;code&gt;pyenv&lt;/code&gt; itself. Checking for &lt;a href=&quot;https://github.com/pyenv/pyenv/issues?q=is%3Aissue+is%3Aclosed&quot;&gt;recent
issues&lt;/a&gt; in
the repository, one can see many distinct problems ranging from incompatible
changes within &lt;code&gt;pyenv&lt;/code&gt; itself, to weird missing C++ links in the Python
executable, failing to create a virtual environment for a specific version
of Python, unavailable or unsupported Python binary, operating system
upgrades breaking the tool, etc. There are 1,700+ issues to date to pick
from.&lt;/li&gt;
&lt;li&gt;So much bash. Assuming you are one of the people on the issues page who
need support, jumping into the source code isn&#x27;t trivial. Almost half of the
repo is composed of bash scripts. That&#x27;s about 50,000 lines of bash code
according to GitHub. I like bash for small scripts, especially for my own.
Debugging thousands of lines of someone else&#x27;s bash is a much harder
problem.&lt;/li&gt;
&lt;/ol&gt;
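&lt;p&gt;The shim mechanism from the first point relies on nothing more than search
path order. The sketch below shows this with Python&#x27;s &lt;code&gt;shutil.which&lt;/code&gt;; the
directories and the tool name &lt;code&gt;mytool&lt;/code&gt; are made up for the example, and it
assumes a POSIX system where the temporary directory allows executables.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;import os
import shutil
import tempfile

# Two throwaway directories stand in for a shims folder and /usr/bin.
shims_dir = tempfile.mkdtemp(prefix=&quot;shims_&quot;)
system_dir = tempfile.mkdtemp(prefix=&quot;system_&quot;)

# Put an executable with the same (made-up) name in both directories.
for directory in (shims_dir, system_dir):
    tool = os.path.join(directory, &quot;mytool&quot;)
    with open(tool, &quot;w&quot;) as handle:
        handle.write(&quot;#!/bin/sh\n&quot;)
    os.chmod(tool, 0o755)

# Lookup walks the search path from left to right, so whichever
# directory is listed first wins.
shims_first = os.pathsep.join([shims_dir, system_dir])
system_first = os.pathsep.join([system_dir, shims_dir])

found_with_shims = shutil.which(&quot;mytool&quot;, path=shims_first)
found_without = shutil.which(&quot;mytool&quot;, path=system_first)
print(found_with_shims)
print(found_without)

# Clean up the throwaway directories.
shutil.rmtree(shims_dir)
shutil.rmtree(system_dir)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Running &lt;code&gt;which pip&lt;/code&gt; in your own shell is the quickest way to see which
directory is currently winning on your machine.&lt;/p&gt;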
&lt;p&gt;After reading all that, you might still find that &lt;code&gt;pyenv&lt;/code&gt; is actually useful
for you and the drawbacks aren&#x27;t that meaningful. If that is the case, please
go for it! If &lt;code&gt;pyenv&lt;/code&gt; wasn&#x27;t useful it wouldn&#x27;t be so popular. But developers
come in different flavours, and given past experience I can tell that &lt;code&gt;pyenv&lt;/code&gt;
isn&#x27;t for me.&lt;/p&gt;
&lt;p&gt;I personally am not a big fan of magical tools and I like to have control and
understanding of how to fundamentally control my work environment as this is
an important part of my job. Any breakage in my local environment in the past
has caused me great pain and stress. Most of these problems have been caused
by mismanagement of dependencies; problems either created by me (lack of
knowledge of how underlying tools work), by the operating system (Ubuntu and
macOS in particular), or by magic tools changing in backwards-incompatible
ways.&lt;/p&gt;
&lt;p&gt;On the other hand, I have frequently been surprised by how easy it is to learn
and use the basic tools provided by the OS or the programming language itself,
which has only added to my scepticism that magical tools add enough value to
justify their added cognitive load and potential bugs.&lt;/p&gt;
&lt;p&gt;I also mentioned at the top of this section that &lt;code&gt;pyenv&lt;/code&gt; is one of the tools
with the smallest footprint. That is true. Other tools such as &lt;code&gt;conda&lt;/code&gt;, &lt;code&gt;asdf&lt;/code&gt;
and, heck, &lt;code&gt;nix&lt;/code&gt; are on a higher level of abstraction. To me, they are even
less desirable for the task of managing Python environments locally.&lt;/p&gt;
&lt;p&gt;There are also other caveats with these tools such as the fact that they
change, grow bigger, and sometimes these changes create backwards
incompatibility with their own earlier versions as we saw above with &lt;code&gt;pyenv&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;It is not hard to find issues on those repositories where some conflicting
dependency has broken the dependency resolver tool itself
[&lt;a href=&quot;https://github.com/conda/conda/issues/12325&quot;&gt;1&lt;/a&gt;]. If you are in a situation
where you need a version management tool to manage your version management
tool, things get complicated. It is a fact that software breaks, and if your
environment management that is built upon high levels of abstraction has failed
you, how will you fix this issue without knowing enough about this 100,000-line
code repository?&lt;/p&gt;
&lt;h2&gt;So Why Do People Use These Tools?&lt;/h2&gt;
&lt;p&gt;I can only speculate from experience since I don&#x27;t have any hard data to
draw on, so take this with a grain of salt.&lt;/p&gt;
&lt;p&gt;I imagine that whether someone will choose to use an environment manager tool
comes down to their background. Preferring to pick a tool over another is a
choice compounded by many factors:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;How junior or senior a developer is. Being a junior developer generally
means that there are many pressing things to learn at once: The programming
language, the specific auxiliary technology ubiquitous in their areas, the
product of the business they are working for, text editors, frameworks,
developer hype, etc. It is totally understandable that when it comes to
understanding tools for managing an environment, spending time to analyse
all choices and select the best one is lower in their list of priorities.
They will pick the one that magically handles everything for them so that
they can move on. I have done that myself many times in many different
problem areas. Magic isn&#x27;t by itself a bad thing, but I think that as one
progresses to more senior levels and becomes interested in particular
topics, it is important to materialise current knowledge and evolve it into
deep knowledge about how things work, and to make an effort to help the
community simplify things if at all possible.&lt;/li&gt;
&lt;li&gt;How much they care about their environment being deterministic at all times.
If a developer only works in a single codebase and the requirements don&#x27;t
change often, why care about managing environments at all? This is a bit
of a moot point, but I know that developers who come from projects like this
have a hard time when they get a job at a company that has several codebases
with different tools and requirements for each and struggle to understand
or care about this type of problem.&lt;/li&gt;
&lt;li&gt;Popularity of a given tool. There is trust that popular projects will be
stable enough and have a community of people backing them. The reasoning is
that if tool X is what professionals in the field use to solve their
problems, it must be good.&lt;/li&gt;
&lt;/ol&gt;
&lt;h2&gt;What do I do then?&lt;/h2&gt;
&lt;p&gt;I am writing this article in 2024. Building Python from source is incredibly
easy yet surprisingly very few people actually do it. Yes... Building
from source! What a crazy idea, nobody builds from source these days and many
people don&#x27;t know how to.&lt;/p&gt;
&lt;p&gt;It is possible to download a specific Python version and set up a virtual
environment using Python&#x27;s own
&lt;a href=&quot;http://web.archive.org/web/20240731142400/https://docs.python.org/3/library/venv.html&quot;&gt;venv&lt;/a&gt;
tool without any extra dependency whatsoever.&lt;/p&gt;
&lt;p&gt;Here&#x27;s a short list of bash commands that download Python 3.11.5 and set up a
virtual environment for it:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-sh&quot;&gt;# You&#x27;ll be installing your Python binaries under $HOME/.python_bin.
mkdir -p $HOME/.python_bin/ &amp;amp;&amp;amp; cd $HOME/.python_bin/

# Download the tar for the Python version you want.
curl -O https://www.python.org/ftp/python/3.11.5/Python-3.11.5.tgz

# Decompress, build, and install it into its own prefix.
tar -xzf Python-3.11.5.tgz &amp;amp;&amp;amp; cd Python-3.11.5
./configure --prefix=$HOME/.python_bin/3.11.5 &amp;amp;&amp;amp; make &amp;amp;&amp;amp; make install

# Create your environment anywhere you like, using the installed binary.
$HOME/.python_bin/3.11.5/bin/python3 -m venv my_env
source my_env/bin/activate
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;That is it. Now you know how the whole process works (it is so easy) and you&#x27;re
using &lt;code&gt;venv&lt;/code&gt;, which has shipped with Python since version 3.3. You can also play
with compilation flags and build the binary with some extensions (but you don&#x27;t
have to!).&lt;/p&gt;
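&lt;p&gt;Since &lt;code&gt;venv&lt;/code&gt; is an ordinary standard library module, the same step can also be
driven from Python. This is only a sketch: the temporary target directory is
made up, and pip is left out purely to keep the example quick.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;import pathlib
import tempfile
import venv

# A throwaway location; in practice this would sit next to your project.
target = pathlib.Path(tempfile.mkdtemp()) / &quot;my_env&quot;

# Same effect as running: python -m venv --without-pip my_env
venv.create(target, with_pip=False)

# Every virtual environment carries a pyvenv.cfg file that records
# which base interpreter it was created from.
config_text = (target / &quot;pyvenv.cfg&quot;).read_text()
print(config_text)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The printed file is a handy first place to look when you are unsure which
Python binary an environment was built from.&lt;/p&gt;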
&lt;p&gt;Of course this is still using some tools that abstract the burden of building
the binary for the project. If you have never built a big C project like
CPython before, you might be asking yourself what is this &lt;code&gt;./configure&lt;/code&gt; script,
what is &lt;code&gt;make&lt;/code&gt; and so on. In a nutshell, this is how binaries are
usually packaged - at some level either you are doing this or your operating
system has come up with a standardised way to build from source for you via
a package manager.&lt;/p&gt;
&lt;p&gt;So now you can run however many virtual environments you want from that binary,
and put them anywhere you like. If you want extra convenience to activate that
environment for a particular project, just create an alias:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-sh&quot;&gt;alias myproj=&amp;quot;source /somewhere/my_env/bin/activate &amp;amp;&amp;amp; cd /somewhere/myproj&amp;quot;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If you aren&#x27;t using Python 3.3+, just swap &lt;code&gt;venv&lt;/code&gt; for anything else that works
for your version, or heck, just directly use that Python executable for your
project - it is totally disposable and you can download another one any time
you like. Now that you know how the process works, it is very easy to change it
to your taste, and that&#x27;s exactly what I wanted to show in this post.&lt;/p&gt;
&lt;p&gt;If you want some further ideas, this is the script I am using in my &lt;code&gt;bashrc&lt;/code&gt;
file.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-sh&quot;&gt;install_python_version() {
  # call this function with a version of Python
  # like `install_python_version 3.11.9`.

  # Clean up first
  rm -rf /tmp/python-install

  # This is where the different Python executables will be installed.
  DIR=$HOME/.python_bin/python-$1
  mkdir -p $DIR

  # This is where temporary installation files will be available.
  mkdir -p /tmp/python-install &amp;amp;&amp;amp; cd /tmp/python-install

  # Download the python version
  curl -O https://www.python.org/ftp/python/$1/Python-$1.tgz

  tar -xzf Python-$1.tgz &amp;amp;&amp;amp; cd /tmp/python-install/Python-$1
  ./configure --prefix=$DIR &amp;amp;&amp;amp; make &amp;amp;&amp;amp; make install

  echo &amp;quot;Now you can install your virtualenv:&amp;quot;
  echo &amp;quot;$HOME/.python_bin/python-$1/bin/python3 -m venv /tmp/my_env&amp;quot;
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You can invoke it from the shell with &lt;code&gt;install_python_version 3.11.5&lt;/code&gt;.&lt;/p&gt;
&lt;h2&gt;But If Building From Source Was Good Package Managers Wouldn&#x27;t Exist...&lt;/h2&gt;
&lt;p&gt;While this is generally true, and package managers are incredibly useful tools,
I think that it is worth picking a battle now and then and building something
from source when it makes sense to do so. I think that, at a minimum, being
comfortable building from source your &lt;strong&gt;main&lt;/strong&gt; tools, plus other tools that are
notorious for having conflicting versions, is good general advice.&lt;/p&gt;
&lt;p&gt;In my case, I rely on my OS package manager a lot for my secondary tools. But
even though pacman is a great package manager, it is not without its drawbacks.
It only builds dependencies with the default flags. If I need more
customisation, I have to step out of the manager or understand how the manager
works so that I can apply the particular building flags I want.&lt;/p&gt;
&lt;p&gt;This is also a problem in a rolling release system like Arch Linux, as
installing multiple versions of the same dependency will point you towards some
form of virtualisation (using docker, for example) or building from source.&lt;/p&gt;
&lt;script async src=&quot;https://scripts.simpleanalyticscdn.com/latest.js&quot;&gt;&lt;/script&gt;</content><published>2024-08-06T00:00:00Z</published><updated>2024-08-06T00:00:00Z</updated></entry><entry><title>Which Assembly Syntax to Choose?</title><link href="https://www.marcelofern.com/posts/asm/att-vs-intel-syntax/index.html"/><id>tag:marcelofern.com,2024-07-24:/asm/att-vs-intel-syntax/index.html</id><content type="html">&lt;h1&gt;Which Assembly Syntax to Choose?&lt;/h1&gt;
&lt;pre&gt;&lt;code&gt;Created at: 2024-07-24
Updated at: 2024-08-13
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;strong&gt;TLDR&lt;/strong&gt;: Use the Intel syntax, but AT&amp;amp;T isn&#x27;t that bad.&lt;/p&gt;
&lt;p&gt;I usually prefer not to post content that is already easily searchable on the
internet. But the problem is, the really great information on this topic seems
to be distributed across just a few different places which sometimes are tricky
to find and often not argumentative enough to prescribe a syntax
recommendation.&lt;/p&gt;
&lt;p&gt;That means that when I eventually forget why I picked one versus another, I
have to scramble across various posts to figure out which syntax to use for a
new project.&lt;/p&gt;
&lt;p&gt;Top results on Google don&#x27;t help much as many link to Reddit threads. Due to
the nature of Reddit, the arguments are rare or non-existent.&lt;/p&gt;
&lt;p&gt;As you already figured out from the TLDR at the top, I prefer the Intel syntax.
I think that a good approach is to be contrarian and start with the differences
that seem to make the Intel syntax &lt;strong&gt;look less desirable&lt;/strong&gt;. I am a fan of
putting downsides up front, and I think it makes an article more honest.
So here we go.&lt;/p&gt;
&lt;h2&gt;Order of Operands&lt;/h2&gt;
&lt;p&gt;In the Intel syntax, the first operand is the destination and the second
operand is the source, whereas in AT&amp;amp;T it is the opposite. This is just about
the most confusing thing when you are comparing AT&amp;amp;T assembly with Intel
assembly.&lt;/p&gt;
&lt;p&gt;If you don&#x27;t read assembly often, it is easy to forget which order each syntax
uses.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;| Intel         | AT&amp;amp;T             |
| --------------|------------------|
| mov rax, 0xFF | movq $0xFF, %rax |
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I prefer the AT&amp;amp;T syntax here because it flows better in English. E.g. &amp;quot;Move
the value 0xFF &lt;strong&gt;into&lt;/strong&gt; rax&amp;quot;.&lt;/p&gt;
&lt;p&gt;The counter argument here for some people is that they still prefer the Intel
syntax in this case because it reads like C:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-asm&quot;&gt;mov rax, rdx        ; rax = rdx
sub rbx, rdi        ; rbx -= rdi
shlx rax, rbx, rdi  ; rax = rbx &amp;lt;&amp;lt; rdi
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If that mode of thinking fits your brain well you probably won&#x27;t see that as a
problem. For me, I always have to &amp;quot;reverse think&amp;quot;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Update (2024-08-13)&lt;/strong&gt;: There is another counter argument. I&#x27;ve come to
realise that ABI rules favour the Intel syntax. So for example the function:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-c&quot;&gt;long sum(long foo, long bar);
// foo -&amp;gt; %rdi
// bar -&amp;gt; %rsi
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;code&gt;foo&lt;/code&gt; is stored in rdi (&amp;quot;d&amp;quot; standing for destination), and &lt;code&gt;bar&lt;/code&gt; is stored in
rsi (&amp;quot;s&amp;quot; standing for source). The convention is to have the destination first
then the source, just like in Intel syntax.&lt;/p&gt;
&lt;h2&gt;AT&amp;amp;T is The Default on GCC, objdump, and GDB&lt;/h2&gt;
&lt;p&gt;This point isn&#x27;t about syntax at all, but I often find tooling characteristics
relevant when making an important choice and thus I can&#x27;t ignore them. I spend
a great deal of time inside gdb and also reading &lt;code&gt;objdump&lt;/code&gt; output, and if there was a
major inconvenience about using a syntax that would put a damper on my use of
&lt;code&gt;gcc&lt;/code&gt;, &lt;code&gt;objdump&lt;/code&gt; and &lt;code&gt;gdb&lt;/code&gt;, I&#x27;d probably consider learning a new syntax.&lt;/p&gt;
&lt;p&gt;For historical reasons &lt;code&gt;GAS&lt;/code&gt; (the GNU assembler, which is a backend of GCC)
originally used the AT&amp;amp;T syntax. Support for Intel was only added later, and
naturally the default remained AT&amp;amp;T syntax.&lt;/p&gt;
&lt;p&gt;This can be changed through configuration, of course, so I have the following
line in my &lt;code&gt;~/.config/gdb/gdbinit&lt;/code&gt; file:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;set disassembly-flavor intel
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And when asking &lt;code&gt;gcc&lt;/code&gt; to emit assembly I use the following:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;gcc -S -masm=intel
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And finally for &lt;code&gt;objdump&lt;/code&gt; I have to run:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;objdump -Mintel
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This isn&#x27;t a problem on my local machine since I can use aliases. But on
another dev environment, or when someone is sharing some code from theirs, it
isn&#x27;t absurd to expect they&#x27;ll be using the defaults. This was a strong reason
for me to commit to learning both syntaxes well. I do have to spin my brain on
hyperthreaded mode to read AT&amp;amp;T syntax. Writing is a bit harder for me because
I keep forgetting the instruction suffixes, and the &lt;code&gt;%&lt;/code&gt; and &lt;code&gt;$&lt;/code&gt; signs as I&#x27;m
more used to writing Intel.&lt;/p&gt;
&lt;h2&gt;Comments&lt;/h2&gt;
&lt;p&gt;Intel syntax uses &lt;code&gt;;&lt;/code&gt; for comments, whereas AT&amp;amp;T uses &lt;code&gt;#&lt;/code&gt; or C style comments.
I do have a slight preference for AT&amp;amp;T style here (C style comments!) but this
is the last point where I think AT&amp;amp;T syntax is better.&lt;/p&gt;
&lt;p&gt;Now the cons... I will follow course and start with the minor problems and go
up to bigger problems.&lt;/p&gt;
&lt;h2&gt;Suffixes&lt;/h2&gt;
&lt;p&gt;Many instructions require suffixes in AT&amp;amp;T when the size of the operands matters:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-asm&quot;&gt;# AT&amp;amp;T instruction suffixes
movb %al, %bl
movw %ax, %bx
movl %eax, %ebx
movq %rax, %rbx
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;code&gt;b&lt;/code&gt; is for byte, &lt;code&gt;w&lt;/code&gt; is for word (16 bits), &lt;code&gt;l&lt;/code&gt; is for long-word (32 bits), and
&lt;code&gt;q&lt;/code&gt; is for quadword (64 bits).&lt;/p&gt;
&lt;p&gt;I don&#x27;t know why the 32-bit length is called &amp;quot;long-word&amp;quot;. I imagine it&#x27;s because
it was added when 32 bits were seen as the limit and &amp;quot;long&amp;quot; made sense then.&lt;/p&gt;
&lt;p&gt;As soon as we got 64 bits, &amp;quot;long&amp;quot; became a confusing word, especially because C
has the &lt;code&gt;long&lt;/code&gt; keyword and on modern 64-bit Linux machines a &lt;code&gt;long&lt;/code&gt; is 64 bits
wide instead of 32. In Intel syntax the 32-bit size is called a &amp;quot;double word&amp;quot;,
which in my opinion is a much clearer name.&lt;/p&gt;
&lt;p&gt;This is a minor issue, you get used to it. In the Intel syntax you often don&#x27;t
need size specifiers because the operands give you this information implicitly:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-asm&quot;&gt;; because esi is 32bits, this is
; the equivalent of &amp;quot;movl&amp;quot; in AT&amp;amp;T
mov esi, 8
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;However, other operations in Intel syntax may &lt;em&gt;also&lt;/em&gt; require a size specifier if the
operands alone aren&#x27;t sufficient to determine the size of the operation. For
example:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-asm&quot;&gt;; how many bytes??
mov  [rbp-20], 20
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You are moving 20 to the address in memory calculated by the value &lt;code&gt;rbp-20&lt;/code&gt;
but how many bytes from the value &amp;quot;20&amp;quot; are you moving? You need to clarify:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-asm&quot;&gt;mov DWORD PTR [rbp-20], 20
&lt;/code&gt;&lt;/pre&gt;
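&lt;p&gt;If you keep forgetting which suffix maps to which size keyword, a tiny
lookup table can serve as a cheatsheet. The helper below is a hypothetical
sketch of my own naming, and it is deliberately naive: it assumes the last
letter of the mnemonic is always a size suffix.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;# AT&amp;amp;T instruction suffix mapped to (Intel size keyword, width in bits).
SIZES = {
    &quot;b&quot;: (&quot;BYTE&quot;, 8),
    &quot;w&quot;: (&quot;WORD&quot;, 16),
    &quot;l&quot;: (&quot;DWORD&quot;, 32),
    &quot;q&quot;: (&quot;QWORD&quot;, 64),
}


def describe(att_mnemonic):
    # Split a suffixed AT&amp;amp;T mnemonic such as movl into mov + l,
    # then look up the matching Intel size keyword.
    base, suffix = att_mnemonic[:-1], att_mnemonic[-1]
    keyword, bits = SIZES[suffix]
    return f&quot;{base} {keyword} PTR ({bits} bits)&quot;


print(describe(&quot;movl&quot;))
print(describe(&quot;movq&quot;))
&lt;/code&gt;&lt;/pre&gt;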
&lt;h2&gt;Prefixes&lt;/h2&gt;
&lt;p&gt;Both registers and immediate values have prefixes in AT&amp;amp;T syntax.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-asm&quot;&gt;# AT&amp;amp;T
movq $25, %rdi
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The fact that Intel doesn&#x27;t use prefixes for registers and immediate values
already shows the reader that prefixes aren&#x27;t necessary.&lt;/p&gt;
&lt;p&gt;The only &amp;quot;downside&amp;quot; I can think of (and please reader correct me if I am
wrong), is that we can&#x27;t have symbols with register names in Intel i.e.,
&lt;code&gt;rax&lt;/code&gt; is not a valid symbol name.&lt;/p&gt;
&lt;p&gt;For example, this code fails to assemble:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-asm&quot;&gt;main:
  mov eax, ebx
  call ax
  ret
ax:
  mov bl, cl
  ret
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Changing &lt;code&gt;ax&lt;/code&gt; to something other than a register name will fix the code. This
may only be a problem when writing code manually. But note that if you are
overriding gcc defaults, the following code blows up when running &lt;code&gt;gcc -masm=intel main.c&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-c&quot;&gt;#include &amp;lt;stdio.h&amp;gt;

long rax(int a, int b) {
  return 32*a &amp;lt;&amp;lt; b;
}

int main() {
  long a;
  a = rax(42, 42);
  printf(&amp;quot;%ld&amp;quot;, a);
}
// Error:
// gcc -masm=intel main.c
// A.s: Assembler messages:
// Error: .size expression for rax does not evaluate to a constant
//
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;That blows up because a symbol (the function named &lt;code&gt;rax&lt;/code&gt;) uses the name of a
register. Changing the name of the function to something else fixes the
problem.&lt;/p&gt;
&lt;h2&gt;Memory Operands&lt;/h2&gt;
&lt;p&gt;This is the biggest pain point of AT&amp;amp;T: memory addressing.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;| Intel                              | AT&amp;amp;T                              |
|------------------------------------|-----------------------------------|
| instr foo, [base+index*scale+disp] | instr disp(base,index,scale), foo |
| add rax, [rbx+rcx*0x4-0x22]        | addq -0x22(%rbx,%rcx,0x4), %rax   |
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Note that displacements aren&#x27;t the same as immediate values and thus don&#x27;t
require a &lt;code&gt;$&lt;/code&gt; prefix. I&#x27;m sure some will think of it as an inconsistency.&lt;/p&gt;
&lt;p&gt;This is where everything packs together. The suffixes, prefixes, and a strange
way to calculate memory addresses. At least the form never changes, so once
you&#x27;re used to the expression it becomes more familiar.&lt;/p&gt;
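&lt;p&gt;Both notations describe the same sum, which can be spelled out in a few
lines of Python. The function names and the register contents used below are
made up for illustration; this is a sketch and not an assembler.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;def effective_address(base, index, scale, disp):
    # Both syntaxes describe the same sum: base + index*scale + disp.
    return base + index * scale + disp


def intel_operand(base, index, scale, disp):
    # Intel spells the sum out inside square brackets.
    return f&quot;[{base}+{index}*{scale}{disp:+#x}]&quot;


def att_operand(base, index, scale, disp):
    # AT&amp;amp;T puts the displacement in front, then (base,index,scale).
    return f&quot;{disp:#x}(%{base},%{index},{scale})&quot;


print(intel_operand(&quot;rbx&quot;, &quot;rcx&quot;, 4, -0x22))
print(att_operand(&quot;rbx&quot;, &quot;rcx&quot;, 4, -0x22))

# With made-up register contents rbx = 0x1000 and rcx = 3:
print(hex(effective_address(0x1000, 3, 4, -0x22)))
&lt;/code&gt;&lt;/pre&gt;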
&lt;h2&gt;Final Remarks&lt;/h2&gt;
&lt;blockquote&gt;
&lt;p&gt;Is that all? Why! It doesn&#x27;t look so bad!&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Well, it doesn&#x27;t look so bad because it isn&#x27;t &lt;em&gt;that&lt;/em&gt; bad! But also keep in mind
that I didn&#x27;t show you any long snippets of assembly code. Take a file with 200
lines of assembly and naturally the AT&amp;amp;T syntax will be more visually daunting.&lt;/p&gt;
&lt;p&gt;There are also other arguments I didn&#x27;t add here regarding documentation.
Intel manuals naturally use the Intel syntax, and there are plenty of Intel
manuals out there, so chances are you&#x27;ll be reading some. Also some of the
MCUs I&#x27;ve worked with on embedded systems follow a syntax that is closer to
Intel.&lt;/p&gt;
&lt;p&gt;If you are writing a new project in Assembly I&#x27;d recommend the Intel syntax.&lt;/p&gt;
&lt;p&gt;But considering that you will likely come across both when &lt;em&gt;reading&lt;/em&gt; code, my
recommendation is to learn both syntaxes, and if you don&#x27;t use assembly that
often just keep a cheatsheet handy so that you can quickly navigate between the
discrepancies.&lt;/p&gt;
&lt;script async src=&quot;https://scripts.simpleanalyticscdn.com/latest.js&quot;&gt;&lt;/script&gt;</content><published>2024-07-24T00:00:00Z</published><updated>2024-08-13T00:00:00Z</updated></entry><entry><title>mov edi, edi</title><link href="https://www.marcelofern.com/posts/asm/mov_edi_edi/index.html"/><id>tag:marcelofern.com,2024-06-08:/asm/mov_edi_edi/index.html</id><content type="html">&lt;h1&gt;mov edi, edi&lt;/h1&gt;
&lt;pre&gt;&lt;code&gt;Created at: 2024-06-08
Updated at: 2024-07-27
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I had a surprise today when I saw the instruction &lt;code&gt;mov edi, edi&lt;/code&gt; as the first
instruction of a function call.&lt;/p&gt;
&lt;p&gt;This is my C code:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-c&quot;&gt;unsigned int func(unsigned int idx) {
  static unsigned int my_table[] = {10, 20, 30, 40};
  return my_table[idx];
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Which produced the following x86-64 assembly (compiled with gcc):&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-asm&quot;&gt;func:
  mov  edi, edi
  lea  rax, my_table.0[rip]
  mov  eax, DWORD PTR [rax+rdi*4]
  ret
  .size  func, .-func
  .section  .rodata
  .align 16
  .type  my_table.0, @object
  .size  my_table.0, 16
my_table.0:
  .long  10
  .long  20
  .long  30
  .long  40
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This code was compiled with the flag -O3, which I thought was going to
eliminate all useless instructions. To my surprise, when I removed all the
unsigned keywords from the function, the &lt;code&gt;mov edi, edi&lt;/code&gt; disappeared in favour
of a &lt;code&gt;movsx rdi, edi&lt;/code&gt;! Here&#x27;s the equivalent asm code:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-asm&quot;&gt;func:
  movsx  rdi, edi
  lea  rax, my_table.0[rip]
  mov  eax, DWORD PTR [rax+rdi*4]
  ret
  .size  func, .-func
  .section  .rodata
  .align 16
  .type  my_table.0, @object
  .size  my_table.0, 16
my_table.0:
  .long  10
  .long  20
  .long  30
  .long  40
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I went on a spiral of research, and I found many links pointing to this
instruction being necessary in Microsoft Windows, so that the OS could perform
hot-patching. &lt;a href=&quot;http://web.archive.org/web/20240610022212/https://devblogs.microsoft.com/oldnewthing/20110921-00/?p=9583&quot;&gt;source&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;However, I compiled this on Linux, so this should not be relevant to me. Here is
the catch: that &lt;code&gt;mov edi, edi&lt;/code&gt; operation is used for zeroing the most
significant 32 bits of the &lt;code&gt;rdi&lt;/code&gt; register.&lt;/p&gt;
&lt;p&gt;It does not seem obvious, but the answer can be found in the x86 tour of Intel
manuals &lt;a href=&quot;http://web.archive.org/web/20240610022212/http://web.archive.org/web/20240415061928/http://x86asm.net/articles/x86-64-tour-of-intel-manuals/&quot;&gt;source&lt;/a&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;General-purpose Registers (...)&lt;/p&gt;
&lt;p&gt;32-bit operands generate a 32-bit result, zero-extended to a 64-bit result
in the destination general-purpose register. (...)&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;UPDATE: In case it wasn&#x27;t clear from the quote above, the zero-extension only
works for 32 bit operands. If you run &lt;code&gt;mov di, di&lt;/code&gt; (di is 16 bits long), the
zero-extension &lt;strong&gt;will not happen&lt;/strong&gt;.&lt;/p&gt;
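&lt;p&gt;The difference between the two operand sizes can be modelled in a few lines of
Python. This is only a sketch of the register-write rule quoted above, not of any
real emulator; the helper names &lt;code&gt;write32&lt;/code&gt; and &lt;code&gt;write16&lt;/code&gt; are made up.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;def write32(old_rdi, value):
    # Writing edi: the low 32 bits are replaced and the upper 32 bits
    # of rdi are zeroed, so the old contents do not matter at all.
    return value % 2**32


def write16(old_rdi, value):
    # Writing di: only the low 16 bits change, the upper 48 bits of
    # rdi are preserved.
    return old_rdi - old_rdi % 2**16 + value % 2**16


rdi = 0xFFFFFFFFFFFFFFFF
edi = rdi % 2**32   # low 32 bits, what the name edi refers to
di = rdi % 2**16    # low 16 bits, what the name di refers to

print(hex(write32(rdi, edi)))  # mov edi, edi: prints 0xffffffff
print(hex(write16(rdi, di)))   # mov di, di:   prints 0xffffffffffffffff
&lt;/code&gt;&lt;/pre&gt;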
&lt;hr&gt;
&lt;p&gt;The zero-extension is indeed what happens when I try to run the following mock
assembly code below:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-asm&quot;&gt;main:
  ; Load `rdi` with all one&#x27;s.
  mov rdi, 0xFFFFFFFFFFFFFFFF
  ; After the instruction below,
  ; rdi will be 0x00000000FFFFFFFF
  mov edi, edi
  ret
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This was not obvious to me at all. The initial instruction &lt;code&gt;mov edi, edi&lt;/code&gt; just
looked like a nop equivalent with two bytes...&lt;/p&gt;
&lt;p&gt;Coming back to my original function:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-c&quot;&gt;unsigned int func(unsigned int idx) {
  static unsigned int my_table[] = {10, 20, 30, 40};
  return my_table[idx];
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Since I am using unsigned integers, the compiler can trust that the arguments
passed to that function in assembly won&#x27;t be more than 32 bits long on my
machine.&lt;/p&gt;
&lt;p&gt;UPDATE: The compiler doesn&#x27;t actually need to &amp;quot;trust&amp;quot; anything; it does not
matter. In the signed version, the &lt;code&gt;movsx&lt;/code&gt; instruction accepts operands of
different sizes. This means that the 32 bits in &lt;code&gt;edi&lt;/code&gt; will be moved with
sign-extension to fit the 64 bits of &lt;code&gt;rdi&lt;/code&gt;. The underlying 32 bit value will
remain the same, and it doesn&#x27;t matter what bits were in the upper 32 bits of
&lt;code&gt;rdi&lt;/code&gt; before the &lt;code&gt;movsx&lt;/code&gt; operation - they will just be completely
overwritten. That is why the instruction &lt;code&gt;mov edi, edi&lt;/code&gt; is not necessary
beforehand!&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;Remember that the calling convention for C functions on my platform (the
System V AMD64 ABI used on Linux) passes the first integer argument, in this
case idx, in the register rdi.&lt;/p&gt;
&lt;p&gt;So this instruction is cleaning up the most significant bits of rdi for us. I am
still not totally sure why this is necessary, but perhaps the compiler assumes
that some garbage could be held in the most significant bits of rdi and tries
to clean that up first to avoid potential bugs.&lt;/p&gt;
&lt;p&gt;This assumption makes sense to me at first, because down in the assembly
function body, we rely on rdi for finding the address offset of the element in
the table that we want to return: &lt;code&gt;mov eax, DWORD PTR [rax+rdi*4]&lt;/code&gt;.&lt;/p&gt;
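&lt;p&gt;To see why stale upper bits would matter for that addressing expression, here
is a small Python sketch. The garbage pattern and the &lt;code&gt;byte_offset&lt;/code&gt; helper are
invented for illustration; the only thing taken from the assembly is the
&lt;code&gt;rdi*4&lt;/code&gt; scaling.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;def byte_offset(rdi):
    # [rax+rdi*4]: the index register is scaled by the element size.
    return rdi * 4


idx = 2                                # the 32-bit argument, held in edi
dirty_rdi = 0xDEADBEEF * 2**32 + idx   # same edi, garbage in the upper half
clean_rdi = dirty_rdi % 2**32          # effect of mov edi, edi

print(byte_offset(clean_rdi))          # 8, the offset of my_table[2]
print(byte_offset(dirty_rdi) == 8)     # False, far outside the table
&lt;/code&gt;&lt;/pre&gt;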
&lt;p&gt;Now remains the question: &amp;quot;Why is there an assumption by the compiler that rdi
can contain garbage in the most significant bits?&amp;quot;.&lt;/p&gt;
&lt;p&gt;This can happen if the function is being called with a &amp;quot;cast&amp;quot; value, given
that casting per se does not clean up unused bits of a 64 bit register. That
could happen if a 64 bit integer was cast down to a 32 bit one.&lt;/p&gt;
&lt;p&gt;Again, this is very much based on my own understanding of how assembly works on
my platform. If you think that I got something wrong, please send me an email at
marceelofernandes@gmail.com.&lt;/p&gt;
&lt;script async src=&quot;https://scripts.simpleanalyticscdn.com/latest.js&quot;&gt;&lt;/script&gt;</content><published>2024-06-08T00:00:00Z</published><updated>2024-07-27T00:00:00Z</updated></entry><entry><title>Goodbye ZSH</title><link href="https://www.marcelofern.com/posts/linux/goodbye_zsh/index.html"/><id>tag:marcelofern.com,2024-05-08:/linux/goodbye_zsh/index.html</id><content type="html">&lt;h1&gt;Goodbye ZSH&lt;/h1&gt;
&lt;pre&gt;&lt;code&gt;Created: 2024-05-08
Updated: 2024-07-06
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;After 7 years using
&lt;a href=&quot;http://web.archive.org/web/20240503162424/https://www.zsh.org/&quot;&gt;zsh&lt;/a&gt; and
&lt;a href=&quot;http://web.archive.org/web/20240501165521/https://ohmyz.sh/&quot;&gt;oh-my-zsh&lt;/a&gt;, I&#x27;ve
completely ditched both of them today.&lt;/p&gt;
&lt;p&gt;I would like to state at the top that there isn&#x27;t anything inherently bad or
wrong with zsh and oh-my-zsh. It is just that these technologies don&#x27;t fit well
within my way of doing things, and have become unnecessary over time.&lt;/p&gt;
&lt;p&gt;There are many reasons for this, but I will start with the reasons for getting
rid of &lt;code&gt;oh-my-zsh&lt;/code&gt; first.&lt;/p&gt;
&lt;h2&gt;oh-my-zsh&lt;/h2&gt;
&lt;p&gt;One may think that oh-my-zsh is zsh itself, but that is not true.
oh-my-zsh is simply a &amp;quot;plugin manager&amp;quot; for zsh.&lt;/p&gt;
&lt;p&gt;The oh-my-zsh package promises wonders. From their website:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Oh My Zsh will not make you a 10x developer...but you may feel like one!&lt;/p&gt;
&lt;p&gt;Once installed, your terminal shell will become the talk of the town or your
money back! With each keystroke in your command prompt, you&#x27;ll take advantage
of the hundreds of powerful plugins and beautiful themes. Strangers will come
up to you in cafés and ask you, &amp;quot;that is amazing! are you some sort of
genius?&amp;quot;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;oh-my-zsh comes with hundreds of plugins pre-installed, many of which you will
never use or hear of, and that includes themes as well.&lt;/p&gt;
&lt;p&gt;Even though this isn&#x27;t a problem in itself, as those plugins are just text
files that will hang around in your system, they are still there when you
didn&#x27;t ask for them.&lt;/p&gt;
&lt;p&gt;This is something that I personally have been trying to reduce in my system as
the burden of maintenance rises with every package added.&lt;/p&gt;
&lt;p&gt;The fewer unused files, dependencies, libs, etc, the less risk there is of
something crashing, requiring updates, or being a security risk. This is
particularly relevant as oh-my-zsh plugins are just a bunch of zsh shell
scripts.&lt;/p&gt;
&lt;p&gt;But this is just me preaching a particular philosophy. A more important
practical problem is the set of shell &lt;code&gt;aliases&lt;/code&gt; that oh-my-zsh brings with it,
many of which are for applications you may not even have installed.&lt;/p&gt;
&lt;p&gt;For example, these are some of the aliases available with oh-my-zsh:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-sh&quot;&gt;alias help=&#x27;man&#x27;
alias _=&#x27;sudo &#x27;
alias :3=&#x27;echo&#x27;
alias dud=&#x27;du -d 1 -h&#x27;
alias drm=&#x27;docker container rm&#x27;
alias p=&#x27;ps -f&#x27;
alias rm=&#x27;rm -i&#x27;
alias ldot=&#x27;ls -ld .*&#x27;
alias lS=&#x27;ls -1FSsh&#x27;
alias hadat=&#x27;heroku addons:attach&#x27;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This might not be a problem unless you have a clashing alias. But still, did
you know you had all those aliases available? Are they really that useful to
you? Have you come across them accidentally and been surprised? The &lt;code&gt;help&lt;/code&gt;
alias to &lt;code&gt;man&lt;/code&gt; particularly bothers me. But also, I don&#x27;t want heroku aliases
in my user land.&lt;/p&gt;
&lt;p&gt;Even if some aliases or plugins are useful, you can copy the ones you want and
just plug them into your &lt;code&gt;.bashrc&lt;/code&gt; file.&lt;/p&gt;
&lt;p&gt;At the end of the day, oh-my-zsh plugins are just shell scripts written in zsh
syntax, many of which are compatible with plain bash or can be easily ported.&lt;/p&gt;
&lt;p&gt;There is no version control or anything fancy like that. Just files. This is
one of the reasons many people won&#x27;t categorise oh-my-zsh as a plug-in manager
(and why I put it between quote marks when I mentioned it earlier), as it
lacks so many features to that end.&lt;/p&gt;
&lt;p&gt;For me it comes down to: I don&#x27;t need this technology, and it does not add much
value to my daily use of my computer, therefore it must go.&lt;/p&gt;
&lt;p&gt;Next are the reasons why I stopped using zsh.&lt;/p&gt;
&lt;h2&gt;zsh&lt;/h2&gt;
&lt;p&gt;One thing that people aren&#x27;t really aware of is that zsh doubles as a scripting
language of its own. They might not realise this until they share a script with
someone, and that script doesn&#x27;t run on their machine.&lt;/p&gt;
&lt;p&gt;For example, the syntax below is only available in zsh:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-sh&quot;&gt;# All files that are NOT .c files (^ provides negation)
ls -d ^*.c

# Grouping
ls (foo|bar).*

# Recursive search with **
ls **/*bar
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;There are more advanced filename-generation patterns, but you get the idea.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;zsh&lt;/code&gt; also allows you to &lt;code&gt;cd&lt;/code&gt; into a directory just by typing its name:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-sh&quot;&gt;% cd /
% setopt autocd
% bin
% pwd
/bin
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The official introduction page has a lot more examples of what is available.
You can check it &lt;a href=&quot;http://web.archive.org/web/20240503012616/https://zsh.sourceforge.io/Intro/intro_toc.html&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Some functionalities like expanding &lt;code&gt;/u/lo/b&lt;/code&gt; to &lt;code&gt;/usr/local/bin&lt;/code&gt; are things
that I do not want to have in my shell, and they strike me as bad patterns due
to the high risk of doing the wrong matching and expanding to the wrong dir
or file.&lt;/p&gt;
&lt;p&gt;But the biggest problem for me is that the zsh scripting language adds way too
many non-POSIX compliant features that end up confusing me a lot. I always have
to look up the syntax to make sure my zsh script isn&#x27;t going to fail with
syntax errors in another environment that doesn&#x27;t use zsh.&lt;/p&gt;
&lt;p&gt;This diminishes my ability to write good portable scripts.&lt;/p&gt;
&lt;p&gt;Part of this is a skill issue on my side (everything is!) as we know every
bash script should be POSIX compliant (joking, not even bash is POSIX
compliant). Nonetheless, for newcomers like I once was, how &amp;quot;cool&amp;quot; a shell
looked was a factor in picking one.&lt;/p&gt;
&lt;p&gt;This type of problem is more pronounced for me because I have several bash
scripts that I created over time with zsh scripting, not even knowing I was using
zsh scripting. This is a common newbie mistake to make, but when you just want
to get something going you often get into these types of trade-offs that become
more pronounced later once you have mastered a few tools.&lt;/p&gt;
&lt;p&gt;So what is the alternative to all of this?&lt;/p&gt;
&lt;h2&gt;bash&lt;/h2&gt;
&lt;p&gt;Yep. I&#x27;m just using plain bash now and trying to figure out how far I can get
with it. So far I haven&#x27;t got a reason to get anything more featureful than
bash.&lt;/p&gt;
&lt;p&gt;I have been using &lt;code&gt;fzf&lt;/code&gt; in the terminal, which is a dependency I already had
and am familiar with, to deal with autocompletion and recursive command search
instead. The experience is much better than the zsh autocompletion.&lt;/p&gt;
&lt;p&gt;These are the lines in my .bashrc that turn on the fzf integration.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-sh&quot;&gt;source /usr/share/fzf/key-bindings.bash
source /usr/share/fzf/completion.bash
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This will enable:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Ctrl+t list files+folders in current directory (e.g., type git add , press
Ctrl+t, select a few files using Tab, finally Enter)&lt;/li&gt;
&lt;li&gt;Ctrl+r search history of shell commands&lt;/li&gt;
&lt;li&gt;Alt+c fuzzy change directory&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This is handy for me, as the functionality is similar to the &lt;code&gt;Telescope&lt;/code&gt; plugin
I have been using in neovim, and I can see a quick preview of files but also
a fuzzy-search output of the reverse search I&#x27;m performing at the time.&lt;/p&gt;
&lt;p&gt;Also, I use alacritty as my terminal.
Alacritty has vi key bindings, so I don&#x27;t need my shell to provide that for me,
one less feature I need from the shell!&lt;/p&gt;
&lt;p&gt;Most terminal emulators have some form of emacs or vi key bindings these days,
so this isn&#x27;t something necessary for a shell to support.&lt;/p&gt;
&lt;p&gt;But that said, I can still turn vi mode on bash with &lt;code&gt;set -o vi&lt;/code&gt;, so you can
choose between using vi mode on your shell or on your terminal.&lt;/p&gt;
&lt;p&gt;And that is pretty much it.&lt;/p&gt;
&lt;p&gt;Nothing fancy - just getting rid of technology that doesn&#x27;t add value
to my day-to-day activities.&lt;/p&gt;
&lt;p&gt;I&#x27;m on a journey to make my installation script as lean as possible to make
updating my system as fast as possible, and also to give my system less
entrypoints to break or be exploited.&lt;/p&gt;
&lt;p&gt;Granted, I haven&#x27;t had any bad experiences with zsh, but that alone doesn&#x27;t
mean I shouldn&#x27;t re-check my previous assumptions and switch a particular
technology for something better (or just pick the boring tech that has always
been there to begin with).&lt;/p&gt;
&lt;p&gt;I have no plans to go more basic and further switch to &lt;code&gt;sh&lt;/code&gt; at this stage, but
I will be looking at &lt;code&gt;dash&lt;/code&gt; next to get the sweet performance enhancements and
something that is more POSIX compliant than bash.&lt;/p&gt;
&lt;script async src=&quot;https://scripts.simpleanalyticscdn.com/latest.js&quot;&gt;&lt;/script&gt;</content><published>2024-05-08T00:00:00Z</published><updated>2024-07-06T00:00:00Z</updated></entry><entry><title>Branchless Programming Experiments in C++ and Python</title><link href="https://www.marcelofern.com/posts/cpp/branchless_programming/index.html"/><id>tag:marcelofern.com,2023-08-22:/cpp/branchless_programming/index.html</id><content type="html">&lt;h1&gt;Branchless Programming Experiments in C++ and Python&lt;/h1&gt;
&lt;pre&gt;&lt;code&gt;Created: 2023-08-22
Updated: 2024-07-28
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This article talks about high-level theoretical concepts of branchless
programming, along with examples of branchless programming in C++ and Python.&lt;/p&gt;
&lt;h2&gt;What&#x27;s branchless programming and why does it matter?&lt;/h2&gt;
&lt;p&gt;A branchless program is a program that doesn&#x27;t include any conditional
operator (&lt;code&gt;if&lt;/code&gt;, &lt;code&gt;else&lt;/code&gt;, &lt;code&gt;switch&lt;/code&gt;, ...).&lt;/p&gt;
&lt;p&gt;The reason why people would go through the trouble of branchless programming is
onefold: performance.&lt;/p&gt;
&lt;p&gt;Modern CPUs try to read future instructions before they are executed so that
they can stay ahead of the game. This is called &amp;quot;instruction pipelining&amp;quot;, and
is meant to implement instruction-level parallelism on single processors.&lt;/p&gt;
&lt;p&gt;However, when the CPU is pipelining and a branch is present, the CPU won&#x27;t be
able to know what path it needs to run, so it takes a guess. When this guess is
incorrect, the CPU discards the instructions previously read, and reads the
instructions for the correct path instead. This takes time and valuable clock
cycles.&lt;/p&gt;
&lt;p&gt;UPDATE: According to the author of the CSAPP book, microprocessors are
architected in a way to achieve branch prediction success rates of about 90%.
The author also provides an estimation of 15 to 30 clock cycles of wasted work
when the branch prediction fails.&lt;/p&gt;
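&lt;p&gt;Those two figures can be combined into a rough expected cost per branch. The
calculation below is a back-of-the-envelope sketch that assumes every branch
mispredicts independently with the same probability, which real workloads do
not guarantee. The function name is made up for illustration.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;def expected_penalty(success_rate, wasted_cycles):
    # Average cycles lost per branch: probability of a wrong guess
    # multiplied by the cost of a wrong guess.
    return (1 - success_rate) * wasted_cycles


# Figures quoted above: about 90% success, 15 to 30 wasted cycles.
low = expected_penalty(0.90, 15)
high = expected_penalty(0.90, 30)
print(round(low, 2), round(high, 2))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Under those assumptions the average cost works out to roughly 1.5 to 3 cycles
per branch.&lt;/p&gt;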
&lt;h2&gt;How does Instruction Pipelining work?&lt;/h2&gt;
&lt;p&gt;The CPU is composed of multiple processor units. Each processor unit performs
an instruction such as adding two numbers, comparing two numbers, jumping to a
different part of a program, loading and storing data in memory, etc. Those
operations are hardwired into the circuitry of the processor inside the CPU.&lt;/p&gt;
&lt;p&gt;When the CPU is asked to perform an instruction, it will receive an &lt;code&gt;opcode&lt;/code&gt;,
which is just a unique binary number that the CPU will decode into
controlling signals that will orchestrate the behaviour of the CPU.&lt;/p&gt;
&lt;p&gt;The CPU executes an instruction by fetching it from memory (either the
computer&#x27;s memory or the CPU cache), following up by decoding the &lt;code&gt;opcode&lt;/code&gt;,
executing the instruction itself in the processor, and storing it back to
memory.&lt;/p&gt;
&lt;p&gt;In a nutshell, a pipeline consists of four stages: &lt;strong&gt;fetch&lt;/strong&gt;, &lt;strong&gt;decode&lt;/strong&gt;,
&lt;strong&gt;execute&lt;/strong&gt;, &lt;strong&gt;write-back&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;Each one of those stages will be handled by a circuit in the CPU. So whenever
an instruction needs to be run, there are &lt;strong&gt;4 high-level steps until the result
is finally stored in memory.&lt;/strong&gt;&lt;/p&gt;
&lt;h2&gt;Pipeline analogy time&lt;/h2&gt;
&lt;p&gt;Imagine you are going to a buffet restaurant with 4 different dishes. This is
a peculiar restaurant, and you need to wait for the person in front of you to
go through all the 4 dishes and pay for it before you can go down and start
serving yourself.&lt;/p&gt;
&lt;p&gt;This is a waste of time. A better way of serving people is to only wait for
the person in front of you to go through the first dish before you start
serving yourself.&lt;/p&gt;
&lt;p&gt;This is what CPUs try to do by &amp;quot;pipelining&amp;quot; the work. While one instruction
is being &lt;code&gt;decoded&lt;/code&gt;, the following one is already being &lt;code&gt;fetched&lt;/code&gt;. When the
first instruction is decoded and starts being executed, the second one
starts being decoded, and a third one is fetched, and so on and so forth...&lt;/p&gt;
&lt;p&gt;This is how it looks visually (image borrowed from wikipedia):&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;pipelining_happy_path.png&quot; alt=&quot;pipelining_happy_path&quot;&gt;&lt;/p&gt;
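&lt;p&gt;The overlap can also be put into numbers. The sketch below assumes an idealised
pipeline in which every stage takes exactly one cycle and nothing ever stalls;
the function names are made up for illustration.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;STAGES = 4  # fetch, decode, execute, write-back


def cycles_without_pipelining(instructions, stages=STAGES):
    # Each instruction goes through every stage before the next one starts.
    return instructions * stages


def cycles_with_pipelining(instructions, stages=STAGES):
    # The first instruction needs all the stages; after that, one
    # instruction completes per cycle.
    if instructions == 0:
        return 0
    return stages + instructions - 1


print(cycles_without_pipelining(100))  # 400
print(cycles_with_pipelining(100))     # 103
&lt;/code&gt;&lt;/pre&gt;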
&lt;p&gt;What happens if the person in front of you in the buffet grabs all the chips
from the buffet tray and, had you known that in advance, you would have put
another spoonful of mashed potatoes on your plate?&lt;/p&gt;
&lt;p&gt;That happens &lt;em&gt;a lot&lt;/em&gt; in the CPU when the next instruction depends on the
execution of the current one. In this case, the CPU needs to wait for the
first instruction to resolve before executing the next one, and this incurs a
time penalty.&lt;/p&gt;
&lt;p&gt;In the example below, during cycle 3 the purple instruction can only be decoded
once the green one is executed. A bubble is created to represent that during
cycle 3 the &lt;code&gt;decode&lt;/code&gt; step will be idle, and subsequently on cycle 4 the
&lt;code&gt;execute&lt;/code&gt; step will be idle, and so on and so forth until the bubble is out of
the pipeline - at which point execution resumes normally.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;pipelining_sad_path.png&quot; alt=&quot;pipelining_sad_path&quot;&gt;&lt;/p&gt;
&lt;p&gt;Sometimes it is even worse than this: you might have an &lt;code&gt;if/else&lt;/code&gt; statement
in your code, and the CPU tried to guess which one to load beforehand, but
it guessed the wrong one. Now it has to flush all of those instructions out of
the pipeline and load the correct ones.&lt;/p&gt;
&lt;p&gt;Here is where branchless programming comes in handy. Code that doesn&#x27;t have
conditionals will likely have fewer erroneously-guessed instructions loaded than
the equivalent code with conditionals.&lt;/p&gt;
&lt;h2&gt;How do branches look in assembly language?&lt;/h2&gt;
&lt;p&gt;Let&#x27;s start with the strawman example. Here&#x27;s some simple C++ code with a
branch:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-cpp&quot;&gt;int max(int a, int b) {
  if (a &amp;gt; b) {
    return a;
  } else {
    return b;
  }
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And the resulting assembly code (note: no optimisation flag turned on):&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-assembly&quot;&gt;max(int, int):
        push    rbp
        mov     rbp, rsp
        mov     DWORD PTR [rbp-4], edi
        mov     DWORD PTR [rbp-8], esi
        mov     eax, DWORD PTR [rbp-4]
        cmp     eax, DWORD PTR [rbp-8]
        jle     .L2
        mov     eax, DWORD PTR [rbp-4]
        jmp     .L3
.L2:
        mov     eax, DWORD PTR [rbp-8]
.L3:
        pop     rbp
        ret
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You will notice that we have two jumps, one of them conditional (&lt;code&gt;jle&lt;/code&gt;). The
equivalent branchless code looks like this:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-cpp&quot;&gt;int max(int a, int b) {
    return a*(a &amp;gt; b) + b*(b &amp;gt;= a);
}
&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code class=&quot;language-assembly&quot;&gt;max(int, int):
        push    rbp
        mov     rbp, rsp
        mov     DWORD PTR [rbp-4], edi
        mov     DWORD PTR [rbp-8], esi
        mov     eax, DWORD PTR [rbp-4]
        cmp     eax, DWORD PTR [rbp-8]
        setg    al
        movzx   eax, al
        imul    eax, DWORD PTR [rbp-4]
        mov     edx, eax
        mov     eax, DWORD PTR [rbp-8]
        cmp     eax, DWORD PTR [rbp-4]
        setge   al
        movzx   eax, al
        imul    eax, DWORD PTR [rbp-8]
        add     eax, edx
        pop     rbp
        ret
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This looks a bit more convoluted, and it has more instructions. However, we
got rid of those jumps.&lt;/p&gt;
&lt;p&gt;This example is terrible, and it&#x27;s chosen on purpose. The first function can
be very easily optimised by the compiler if we use the flag &lt;code&gt;-O3&lt;/code&gt;, generating
this assembly code:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-assembly&quot;&gt;max(int, int):
        cmp     edi, esi
        mov     eax, esi
        cmovge  eax, edi
        ret
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Whereas for the second code, even with the optimisation flag on, the underlying
assembly code is worse as the compiler can&#x27;t optimise it further:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-assembly&quot;&gt;max(int, int):
        xor     eax, eax
        cmp     edi, esi
        cmovle  edi, eax
        cmovg   esi, eax
        lea     eax, [rdi+rsi]
        ret
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In this case, the branchless C++ code fell apart because the compiler is
really good at optimisations. One such optimisation is branchless
programming itself: &lt;code&gt;cmov&lt;/code&gt; is a conditional move, not a jump. This
illustrates why it&#x27;s important to actually see what the compiled code looks
like. Still, all other things being equal, code without hard-to-predict
branches tends to be faster at the assembly level, and there will be many
times where the compiler can&#x27;t optimise the code (like when you have &lt;code&gt;volatile&lt;/code&gt;
variables all over).&lt;/p&gt;
&lt;h2&gt;What about interpreted languages?&lt;/h2&gt;
&lt;p&gt;Many interpreted languages don&#x27;t have the cleverness for optimisation of a
compiler like GCC, and in many cases, code run by the virtual machine is murky to the
outsider&#x27;s eyes. Nevertheless, I work with Python at the moment and it would be
interesting to see what happens once branchless programming takes over.&lt;/p&gt;
&lt;p&gt;Using the same example in Python we have:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;def max(a, b):
    if a &amp;gt; b:
        return a
    else:
        return b
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And this is the disassembled Python byte code into mnemonics:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;  2           0 LOAD_FAST                0 (a)
              2 LOAD_FAST                1 (b)
              4 COMPARE_OP               4 (&amp;gt;)
              6 POP_JUMP_IF_FALSE        6 (to 12)

  3           8 LOAD_FAST                0 (a)
             10 RETURN_VALUE

  5     &amp;gt;&amp;gt;   12 LOAD_FAST                1 (b)
             14 RETURN_VALUE
&lt;/code&gt;&lt;/pre&gt;
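&lt;p&gt;A listing like the one above can be produced with the standard &lt;code&gt;dis&lt;/code&gt; module.
Opcode names and offsets differ between CPython versions, so the output on a
newer interpreter will not match the listing exactly. The function is renamed
to &lt;code&gt;max_branchy&lt;/code&gt; here only to avoid shadowing the built-in &lt;code&gt;max&lt;/code&gt;.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;import dis


def max_branchy(a, b):
    if a &amp;gt; b:
        return a
    else:
        return b


# Print the human-readable disassembly.
dis.dis(max_branchy)

# The same information is available programmatically.
opnames = [ins.opname for ins in dis.get_instructions(max_branchy)]
print(opnames)
&lt;/code&gt;&lt;/pre&gt;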
&lt;p&gt;First things first, what is happening under the hood? For every bytecode
instruction that is executed, the interpreter will branch out many times.
The comparison operator &lt;code&gt;&amp;gt;&lt;/code&gt; for example, requires a branch to check for the
opcode equivalent of &lt;code&gt;&amp;gt;&lt;/code&gt;, another branch to verify if the object being
compared has a &lt;code&gt;__gt__&lt;/code&gt; method, more branches to verify if both objects
being compared are valid for the comparison being performed, and many other
branches until the value of the function call is actually computed and
returned.&lt;/p&gt;
&lt;p&gt;We cannot compare Python bytecode with a single machine-level instruction,
because a single bytecode instruction will perform many machine-level
instructions inside the interpreter. Also, some Python bytecode instructions
like calling a function are more expensive than other simpler ones like
performing a mathematical operation like adding.&lt;/p&gt;
&lt;p&gt;With all the conditional compilation clutter removed from CPython, the code
that evaluates a piece of bytecode into a C instruction is as follows:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-c&quot;&gt;PyObject* _PyEval_EvalFrameDefault(/* ... */ ) {
    // context setup
    for (;;) {
        // periodic check
        switch (opcode) {
            case TARGET(LOAD_FAST): {
                PyObject *value = GETLOCAL(oparg);
                if (value == NULL) {
                    format_exc_check_arg(/* ... */ );
                    goto error;
                }
                Py_INCREF(value);
                PUSH(value);
                FAST_DISPATCH();
            }
            case TARGET(STORE_FAST): {
                PyObject *value = POP();
                SETLOCAL(oparg, value);
                FAST_DISPATCH();
            }
            case TARGET(BINARY_MULTIPLY): {
                PyObject *right = POP();
                PyObject *left = TOP();
                PyObject *res = PyNumber_Multiply(left, right);
                Py_DECREF(left);
                Py_DECREF(right);
                SET_TOP(res);
                if (res == NULL)
                    goto error;
                DISPATCH();
            }
            /* ... */
        }
    }
error:
    // exception unwinding
    // context cleanup
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The full implementation is &lt;a href=&quot;https://github.com/python/cpython/blob/3.8/Python/ceval.c#L1323&quot;&gt;here&lt;/a&gt;.
The interesting bit is that even for a simple instruction like &lt;code&gt;LOAD_FAST&lt;/code&gt;, we
can see a branch in the top-level case statement handler.&lt;/p&gt;
&lt;p&gt;This means that to get a rough estimation of how two functions compare, we&#x27;ll
need to check how many bytecode instructions there are, and how expensive those
bytecode instructions are.&lt;/p&gt;
&lt;p&gt;At the moment of writing, I haven&#x27;t found a handy table of Python bytecodes
ordered from more-overhead to less-overhead, so we&#x27;ll analyse one by one.&lt;/p&gt;
&lt;p&gt;Our &lt;code&gt;max(a, b)&lt;/code&gt; function above had the following instructions:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;code&gt;LOAD_FAST&lt;/code&gt; (4x): Performs an index lookup in the local variables array to
load the variable. This is pretty fast.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;COMPARE_OP&lt;/code&gt; (1x): Has a very high overhead when the comparison operator
is not just checking object identity as it needs to look at what is
in the dunder method for the particular comparison.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;POP_JUMP_IF_FALSE&lt;/code&gt; (1x): Has a low overhead from the interpreter&#x27;s
perspective as the next position to jump to is not hard to find out by
reading the bytecode.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;RETURN_VALUE&lt;/code&gt; (2x): This just pops the stack, nice and easy.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;How about the branchless version?&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;def max(a, b):
    return a*(a &amp;gt; b) + b*(b &amp;gt;= a)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;opcodes:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;  2           0 LOAD_FAST                0 (a)
              2 LOAD_FAST                0 (a)
              4 LOAD_FAST                1 (b)
              6 COMPARE_OP               4 (&amp;gt;)
              8 BINARY_MULTIPLY
             10 LOAD_FAST                1 (b)
             12 LOAD_FAST                1 (b)
             14 LOAD_FAST                0 (a)
             16 COMPARE_OP               5 (&amp;gt;=)
             18 BINARY_MULTIPLY
             20 BINARY_ADD
             22 RETURN_VALUE
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;We can already tell that this will not be light on the interpreter because
it has two &lt;code&gt;COMPARE_OP&lt;/code&gt; instructions. The other differences here are:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;code&gt;BINARY_MULTIPLY&lt;/code&gt;: Surprisingly this has a considerable amount of overhead.
The interpreter needs to figure out the types being multiplied and find
their underlying multiply function before they can actually be multiplied.
So a &amp;quot;binary multiply&amp;quot; does not mean the interpreter will just process a
C &lt;code&gt;*&lt;/code&gt; between the two operands.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;BINARY_ADD&lt;/code&gt;: Very similar to the above. Curiously enough, it seems like
someone tried to optimise int summation &lt;a href=&quot;https://github.com/python/cpython/blob/3.8/Python/ceval.c#L1547&quot;&gt;but failed&lt;/a&gt;.&lt;/li&gt;
&lt;/ol&gt;
&lt;pre&gt;&lt;code class=&quot;language-c&quot;&gt;/* NOTE(haypo): Please don&#x27;t try to micro-optimize int+int on
   CPython using bytecode, it is simply worthless.
   See http://bugs.python.org/issue21955 and
   http://bugs.python.org/issue10044 for the discussion. In short,
   no patch shown any impact on a realistic benchmark, only a minor
   speedup on microbenchmarks. */
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In conclusion, this kind of branchless optimisation does not quite work in
Python. However, due to time constraints, I haven&#x27;t analysed other branchless
techniques, such as bit masking, that are superior in many situations.&lt;/p&gt;
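&lt;p&gt;One way to sanity-check this is to count the compiled instructions with
&lt;code&gt;dis&lt;/code&gt; and time both versions with &lt;code&gt;timeit&lt;/code&gt;. The sketch below is only
illustrative: the names &lt;code&gt;max_branch&lt;/code&gt;, &lt;code&gt;max_branchless&lt;/code&gt; and
&lt;code&gt;count_instructions&lt;/code&gt; are mine, the branching version is assumed to be a plain
&lt;code&gt;if&lt;/code&gt; followed by two returns, and both the opcode names and the timings will
differ between CPython versions and machines.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;import dis
import timeit


def max_branch(a, b):
    # Branching version: one comparison and a conditional jump.
    if a > b:
        return a
    return b


def max_branchless(a, b):
    # Branchless version: two comparisons, two multiplications, one addition.
    return a * (a > b) + b * (b >= a)


def count_instructions(func):
    # Number of bytecode instructions CPython compiled for func.
    return sum(1 for _ in dis.get_instructions(func))


if __name__ == "__main__":
    for func in (max_branch, max_branchless):
        # Time 200,000 calls of each version with the same arguments.
        elapsed = timeit.timeit(lambda: func(3, 7), number=200_000)
        print(func.__name__, count_instructions(func), "instructions", round(elapsed, 4), "s")
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I am deliberately not quoting numbers here, since they depend on the
interpreter version. The point is only that the branchless version compiles to
more instructions, several of them the expensive kind discussed above.&lt;/p&gt;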
&lt;script async src=&quot;https://scripts.simpleanalyticscdn.com/latest.js&quot;&gt;&lt;/script&gt;</content><published>2023-08-22T00:00:00Z</published><updated>2024-07-28T00:00:00Z</updated></entry><entry><title>A Critique of SOLID</title><link href="https://www.marcelofern.com/posts/software-design/a-critique-of-solid/index.html"/><id>tag:marcelofern.com,2023-04-15:/software-design/a-critique-of-solid/index.html</id><content type="html">&lt;h1&gt;A Critique of SOLID&lt;/h1&gt;
&lt;pre&gt;&lt;code&gt;Created: 2023-04-15
Updated: 2024-07-06
&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;SOLID is an acronym for five design principles promoted by Robert C. Martin
(also known as Uncle Bob), aimed at making Object Oriented Programming designs
easier to understand, maintain, and adapt.&lt;/p&gt;
&lt;p&gt;The &lt;a href=&quot;https://web.archive.org/web/20150906155800/http://www.objectmentor.com/resources/articles/Principles_and_Patterns.pdf&quot;&gt;original paper (archived)&lt;/a&gt;
introducing most of these principles in 2000 is a quick read worth checking,
even if only for historical context.&lt;/p&gt;
&lt;p&gt;Before the paper starts to talk about SOLID, it mentions the 4 symptoms of
&amp;quot;rotting software&amp;quot; (a very popular term between 1998 and 2006 according to
Google Ngram). Those 4 symptoms are: &lt;strong&gt;rigidity&lt;/strong&gt;, &lt;strong&gt;fragility&lt;/strong&gt;,
&lt;strong&gt;immobility&lt;/strong&gt; and &lt;strong&gt;viscosity&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;It is important to know what those terms mean, because they are the reason that
SOLID principles exist in the first place. Here is a brief summary:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Rigidity: How difficult it is to change the code.&lt;/li&gt;
&lt;li&gt;Fragility: How easy it is to break the code.&lt;/li&gt;
&lt;li&gt;Immobility: How hard it is to reuse existing code.&lt;/li&gt;
&lt;li&gt;Viscosity: How hard it is to preserve the existing design of code when
developing new changes.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The general expectation set by Robert C. Martin is that if you follow the SOLID
principles your code will experience less software rot.&lt;/p&gt;
&lt;h2&gt;SOLID&lt;/h2&gt;
&lt;p&gt;The 5 principles of object oriented &lt;strong&gt;class&lt;/strong&gt; design (as the paper calls
them) are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;S&lt;/strong&gt;ingle responsibility principle.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;O&lt;/strong&gt;pen-closed principle.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;L&lt;/strong&gt;iskov substitution principle.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;I&lt;/strong&gt;nterface segregation principle.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;D&lt;/strong&gt;ependency inversion principle.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;My plan is to analyse each of them critically, understanding their weaknesses
and providing evidence to counter their adoption.&lt;/p&gt;
&lt;h3&gt;The Open Closed Principle (OCP)&lt;/h3&gt;
&lt;p&gt;According to Martin this is the most important principle, so we will start
with it. Here is its definition:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;A module should be open for extension but closed for modification.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This sounds simple and reasonable. Perfect code that addresses a particular
problem only needs to be written once. To address further problems, this
perfect code doesn&#x27;t need to change, but only be &lt;em&gt;extended&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;When developers keep this in mind, they try to produce perfect code that can be
extended whilst keeping its original elegance and efficiency. If that
materialises, the developer can be said to be following the OCP.&lt;/p&gt;
&lt;p&gt;You can probably see where this is going by my wording (&amp;quot;perfect code&amp;quot;,
whatever that actually means). But let&#x27;s not diverge from the theme just yet.
We will go through an example of code provided by Martin that violates the
OCP:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-cpp&quot;&gt;struct Modem {
    enum Type {hayes, courrier, ernie} type;
};

struct Hayes {
    Modem::Type type;
    // Hayes related stuff
};

struct Courrier {
    Modem::Type type;
    // Courrier related stuff
};

struct Ernie
{
    Modem::Type type;
    // Ernie related stuff
};

void LogOn(Modem&amp;amp; m, string&amp;amp; pno, string&amp;amp; user, string&amp;amp; pw) {
    if (m.type == Modem::hayes)
        DialHayes((Hayes&amp;amp;)m, pno);
    else if (m.type == Modem::courrier)
        DialCourrier((Courrier&amp;amp;)m, pno);
    else if (m.type == Modem::ernie)
        DialErnie((Ernie&amp;amp;)m, pno);
    // ...you get the idea
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The &lt;code&gt;LogOn&lt;/code&gt; function violates Martin&#x27;s OCP because its inner code needs to
change every time a new modem is added. Moreover, every modem type depends on
the &lt;code&gt;struct Modem&lt;/code&gt;. Therefore, if we need to add a new type of modem to that
struct, all the existing modems need to be recompiled.&lt;/p&gt;
&lt;p&gt;So how do we make this code better? According to Martin:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Abstraction is the key to the OCP&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;There are a few possible abstraction techniques that conform to the OCP. Let&#x27;s
follow the first one Martin provides, &lt;strong&gt;Dynamic Polymorphism&lt;/strong&gt;.&lt;/p&gt;
&lt;h4&gt;Dynamic Polymorphism&lt;/h4&gt;
&lt;p&gt;In Object Oriented Programming this means we will have an abstract class with
abstract methods (virtual functions), and concrete child classes implementing
the actual code for each method.&lt;/p&gt;
&lt;p&gt;Here the word &amp;quot;Dynamic&amp;quot; means that the form (concrete class implementation) is
found during run time. Putting it all together and rewriting the code above, we
end up with something like the example provided in Martin&#x27;s paper:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-cpp&quot;&gt;class Modem {
public:
    virtual void Dial(const string&amp;amp; pno) = 0;
    // other virtual methods here, you get the idea.
};

void LogOn(Modem&amp;amp; m, string&amp;amp; pno, string&amp;amp; user, string&amp;amp; pw) {
    m.Dial(pno);
    // you get the idea.
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now the class &lt;code&gt;Modem&lt;/code&gt; is closed for modification when we need to add new types
of modems. To use the LogOn function, you need code like this:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-cpp&quot;&gt;class Hayes : public Modem {
public:
    virtual void Dial(const std::string&amp;amp; pno) {
        // do something...
    }
};


int main() {
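    Hayes hayes;
    string pno = &amp;quot;1&amp;quot;, user = &amp;quot;user&amp;quot;, pw = &amp;quot;secret&amp;quot;;
    // LogOn takes the modem plus the phone number, user and password.
    LogOn(hayes, pno, user, pw);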
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now that we see the whole picture we can tell that the function &lt;code&gt;LogOn&lt;/code&gt; calls
the method &lt;code&gt;Dial&lt;/code&gt; on the child instance. This child instance is only
available at runtime. Note that Martin omits this initialisation code in his
original paper. He also does not discuss the downsides of this approach.&lt;/p&gt;
&lt;p&gt;However, I don&#x27;t think we should toss this example away too quickly.&lt;/p&gt;
&lt;p&gt;When types are only resolved at runtime, our code generally becomes slower.
But why? For each class that declares or inherits virtual functions the
compiler creates a &lt;code&gt;vtable&lt;/code&gt;. A &lt;code&gt;vtable&lt;/code&gt; is essentially a table of function
pointers. When an object is created at runtime, a pointer to its vtable is
added to its memory layout, and when a virtual function is called on this
object, the code looks up the appropriate function pointer in the object&#x27;s
vtable and then calls the function through that pointer.&lt;/p&gt;
&lt;p&gt;This all means that by using &lt;strong&gt;Dynamic polymorphism&lt;/strong&gt; our code requires
an additional level of indirection for &lt;em&gt;each&lt;/em&gt; virtual function call, and the
vtable pointer stored in every object increases the program&#x27;s memory usage too.&lt;/p&gt;
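&lt;p&gt;To make the mechanism concrete, here is a hand-rolled analogy in Python
(chosen for brevity; it is not how a C++ compiler lays things out). The names
&lt;code&gt;ModemObject&lt;/code&gt;, &lt;code&gt;vptr&lt;/code&gt; and &lt;code&gt;call_virtual&lt;/code&gt; are made up for this sketch,
and a real vtable is indexed by a fixed slot number rather than by name.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;def hayes_dial(modem, pno):
    return "hayes dials " + pno


def ernie_dial(modem, pno):
    return "ernie dials " + pno


# One table per "class", shared by all of its objects.
HAYES_VTABLE = {"Dial": hayes_dial}
ERNIE_VTABLE = {"Dial": ernie_dial}


class ModemObject:
    def __init__(self, vtable):
        # Every object carries one extra reference: the pointer to its vtable.
        self.vptr = vtable


def call_virtual(obj, name, *args):
    # Step 1: follow the object's vtable pointer.
    # Step 2: look up the function slot.
    # Step 3: call through it (the indirection a direct call would not need).
    return obj.vptr[name](obj, *args)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Calling &lt;code&gt;call_virtual(ModemObject(HAYES_VTABLE), "Dial", "1")&lt;/code&gt; only finds out
which function to run once it has gone through the table, which is the extra
hop described above.&lt;/p&gt;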
&lt;p&gt;Furthermore, the given example is a simplistic one. In cases where dynamic
polymorphism is overused, the code may have convoluted inheritance
hierarchies. Such complexity impairs compiler optimisation, as virtual function
calls generally can&#x27;t be resolved (and therefore inlined) until runtime.&lt;/p&gt;
&lt;p&gt;You might be thinking that the code above looks clean and that there&#x27;s no
reason not to do it. After all, reduced performance and more memory usage are
a fair price to pay for cleaner code, right? Well, what if we didn&#x27;t have
to pay this price and could still have &amp;quot;clean code&amp;quot;?&lt;/p&gt;
&lt;p&gt;We will get there, but to keep things fair we need to go through another
abstraction example given by Martin.&lt;/p&gt;
&lt;h4&gt;Static Polymorphism&lt;/h4&gt;
&lt;p&gt;What if there was a way to have something similar to Dynamic Polymorphism but
without the runtime overhead, with more compiler optimisation, and more
control over object types? Here is where static polymorphism comes in handy.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Static&lt;/strong&gt; here means that the child type will be defined at compile time. But
how? The answer is by using generic programming templates.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-cpp&quot;&gt;template &amp;lt;typename MODEM&amp;gt;
void LogOn(MODEM&amp;amp; m, string&amp;amp; pno, string&amp;amp; user, string&amp;amp; pw) {
    m.Dial(pno);
    // you get the idea.
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This looks alright, but is this a real improvement? What is the trade off?&lt;/p&gt;
&lt;p&gt;Each time a template is instantiated with a new type, a new copy of the code
is created. This is how compilers make generic programming type-safe at compile
time. The more instantiations your code has, the larger your executable
becomes, and the longer compilation takes.&lt;/p&gt;
&lt;p&gt;This all means that we are trading runtime performance for compile time
performance. It might be OK to do so in certain applications, but what if we
didn&#x27;t have to?&lt;/p&gt;
&lt;h4&gt;Addressing the open-closed principle&lt;/h4&gt;
&lt;p&gt;Martin&#x27;s original argument stressed the following:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;The dependency between the Modem struct and its implementation structs is
bad.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;My problem with this assertion is that it sounds like a straw man fallacy. You
absolutely don&#x27;t need to create one struct for each type of modem. Let&#x27;s fix
the example:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-cpp&quot;&gt;enum ModemType { HAYES, COURRIER, ERNIE };

struct Modem {
    ModemType type;
    // other attributes, you get the idea.
};

void LogOn(Modem&amp;amp; m, string&amp;amp; pno) {
    switch (m.type) {
        case HAYES:
            // do something with hayes
            break;
        case COURRIER:
            // do something with courrier
            break;
        // ...you get the idea
    }
}
&lt;/code&gt;&lt;/pre&gt;
&lt;ol&gt;
&lt;li&gt;Now the compiler knows all the code paths the code goes through at compile
time. Code can easily be optimised.&lt;/li&gt;
&lt;li&gt;No layers of indirection and thus no performance costs at runtime.&lt;/li&gt;
&lt;li&gt;No memory overhead.&lt;/li&gt;
&lt;li&gt;No executable size overhead.&lt;/li&gt;
&lt;li&gt;Fewer lines of code.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Is the code above objectively bad? Is it difficult to read? I could argue that
any CS student could make sense of this code in a very short amount of time.&lt;/p&gt;
&lt;p&gt;The idea behind polymorphism is that it makes code flexible and easier to read.
However, the flexibility of using virtual functions can be costly as we have
seen. If you don&#x27;t know whether you need that flexibility yet, you will be
imagining future use cases that &lt;em&gt;may&lt;/em&gt; use your code. If this flexibility ends
up not being needed, the programmer has fundamentally over-scoped their own
code and caused unnecessary memory and CPU degradation to their program with no
added benefit.&lt;/p&gt;
&lt;p&gt;Here is where the trade-off lies. Imagine that you need to add a new type of
modem. Using the &lt;code&gt;switch&lt;/code&gt; above you&#x27;d have to change the LogOn function and
recompile it. You would also need to recompile everything that depends on the
LogOn function, and that can be a lot. If you were using polymorphism, you
would just need to add a new type (potentially in a new file), and you would
only need to compile that single file plus the place where it is instantiated
(and everything that depends on that).&lt;/p&gt;
&lt;p&gt;But what if you need to add new functionality to the abstract class? In the
polymorphism case you&#x27;d need to implement the new function in every single
type, and that would induce a lot of recompilation across the project for a
fairly small piece of code.&lt;/p&gt;
&lt;p&gt;In contrast, with the &lt;code&gt;switch&lt;/code&gt; approach, you could add a new function
(potentially in a new file) and you would only need to compile that single file.&lt;/p&gt;
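&lt;p&gt;The asymmetry between the two styles can be sketched in a few lines. The
snippet below uses Python instead of C++ purely for brevity, so it says nothing
about recompilation; it only shows which pieces of source need to be touched.
&lt;code&gt;hang_up&lt;/code&gt; is a hypothetical piece of new functionality that is not part of
Martin&#x27;s example, and the returned strings are placeholders.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;from dataclasses import dataclass
from enum import Enum, auto


class ModemType(Enum):
    HAYES = auto()
    COURRIER = auto()
    ERNIE = auto()


@dataclass
class Modem:
    type: ModemType


def log_on(modem, pno):
    # Existing behaviour: one function that branches on the type tag.
    if modem.type is ModemType.HAYES:
        return "hayes dials " + pno
    if modem.type is ModemType.COURRIER:
        return "courrier dials " + pno
    if modem.type is ModemType.ERNIE:
        return "ernie dials " + pno
    raise ValueError(modem.type)


def hang_up(modem):
    # New functionality: added as a new function, possibly in a new file.
    # Neither Modem nor log_on had to change.
    if modem.type is ModemType.HAYES:
        return "hayes hangs up"
    if modem.type is ModemType.COURRIER:
        return "courrier hangs up"
    if modem.type is ModemType.ERNIE:
        return "ernie hangs up"
    raise ValueError(modem.type)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Adding &lt;code&gt;hang_up&lt;/code&gt; touched no existing code, whereas adding a fourth modem type
would mean revisiting both functions. With an abstract &lt;code&gt;Modem&lt;/code&gt; class the
situation is reversed.&lt;/p&gt;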
&lt;p&gt;Someone might argue that a potential downside from the &lt;code&gt;switch&lt;/code&gt; approach is
that when a new type of Modem is added, you need to track all the functions
that have a &lt;code&gt;switch (m.type)&lt;/code&gt; to change their code, and this can induce human
error.&lt;/p&gt;
&lt;p&gt;I would tend to agree. However, compiler warnings will let you know about
every &lt;code&gt;switch&lt;/code&gt; that is missing a case. For example:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-c&quot;&gt;#include &amp;lt;stdio.h&amp;gt;

enum ModemType { HAYES, COURRIER, ERNIE };

struct Modem {
    enum ModemType type;
};

void LogOn(struct Modem* m) {
  switch (m-&amp;gt;type) {
    // Note how we are only handling HAYES and
    // forgot to handle COURRIER and ERNIE.
    case HAYES:
      printf(&amp;quot;LogOn HAYES&amp;quot;);
      break;
  }
}

int main() {
  struct Modem m = {HAYES};
  LogOn(&amp;amp;m);
  return 0;
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If I try to compile this code with:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-sh&quot;&gt;gcc -Wall &amp;lt;file_name&amp;gt; -o /tmp/a.o
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I get the following warnings:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-c&quot;&gt;/tmp/main.c: In function ‘LogOn’:
/tmp/main.c:10:3: warning: enumeration value ‘COURRIER’ not handled in switch [-Wswitch]
   10 |   switch (m-&amp;gt;type) {
      |   ^~~~~~
/tmp/main.c:10:3: warning: enumeration value ‘ERNIE’ not handled in switch [-Wswitch]
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In other words, the compiler catches this mistake for us.&lt;/p&gt;
&lt;p&gt;In conclusion, this design choice will determine what will be easier and what
will be harder for your program. Ask yourself the question: &amp;quot;Do I want it to be
easier to add more functionality to my program, or do I prioritise adding more
types?&amp;quot;. The answer will dictate what is best for your situation.&lt;/p&gt;
&lt;p&gt;Polymorphism isn&#x27;t a silver bullet that will always make your code OCP
compliant. In fact, the more functional your code looks, the greater the
penalty of using polymorphism.&lt;/p&gt;
&lt;p&gt;In my professional experience, I spend much more time adding new functionality
to existing software than I spend adding new types. So for me, it would not
make sense to prefer an architecture that focuses on types.&lt;/p&gt;
&lt;h3&gt;The Liskov Substitution Principle (LSP)&lt;/h3&gt;
&lt;p&gt;This principle states that an object of a parent type (such as an abstract
class) may be replaced by an object of a subtype (such as a child class)
without breaking the program.&lt;/p&gt;
&lt;p&gt;This means that if we have a function &lt;code&gt;foo(Parent bar)&lt;/code&gt;, we should also be
able to call foo as &lt;code&gt;foo(Child bar)&lt;/code&gt; without altering the correctness of the
program.&lt;/p&gt;
&lt;p&gt;Martin uses an example of a parent class called &lt;code&gt;Ellipse&lt;/code&gt; and a child class
called &lt;code&gt;Circle&lt;/code&gt;. As every circle is an ellipse with a very particular
configuration, this sounds about right. However, Martin only uses this example
to stress that it is a &lt;em&gt;bad&lt;/em&gt; use of inheritance.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-cpp&quot;&gt;void f(Ellipse&amp;amp; e) {
    Point a(-1,0);
    Point b(1,0);
    e.SetFoci(a,b);
    e.SetMajorAxis(3);
    assert(e.GetFocusA() == a);
    assert(e.GetFocusB() == b);
    assert(e.GetMajorAxis() == 3);
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The function &lt;code&gt;f&lt;/code&gt; above, for example, can&#x27;t receive a &lt;code&gt;Circle&lt;/code&gt; instead of an
&lt;code&gt;Ellipse&lt;/code&gt;. The reason is that the method &lt;code&gt;SetFoci&lt;/code&gt; will alter the circle and
turn it into an ellipse. This can become a subtle bug in the application code.&lt;/p&gt;
&lt;p&gt;A safe option is for the &lt;code&gt;Circle::SetFoci&lt;/code&gt; method to add an extra validation,
asserting that &lt;code&gt;a == b&lt;/code&gt;. This violates LSP. The child object now has an extra
restriction that the parent object doesn&#x27;t. Martin concludes that:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;derived methods should expect no more and provide no less.&lt;/p&gt;
&lt;/blockquote&gt;
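&lt;p&gt;The failure mode can be sketched in Python (used here for brevity instead of
the C++ of Martin&#x27;s example). This is one possible &lt;code&gt;Circle&lt;/code&gt;, not necessarily
the one from the paper: it keeps itself circular by ignoring the second focus,
and the attribute names are assumptions of mine.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;class Ellipse:
    def __init__(self):
        self.focus_a = (0, 0)
        self.focus_b = (0, 0)

    def set_foci(self, a, b):
        self.focus_a = a
        self.focus_b = b


class Circle(Ellipse):
    def set_foci(self, a, b):
        # A circle keeps both foci at the same point, so it quietly ignores b.
        self.focus_a = a
        self.focus_b = a


def f(ellipse):
    # Client code written against the Ellipse contract: after set_foci,
    # both foci are expected to hold exactly the values that were passed in.
    a, b = (-1, 0), (1, 0)
    ellipse.set_foci(a, b)
    return ellipse.focus_a == a and ellipse.focus_b == b
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;code&gt;f(Ellipse())&lt;/code&gt; returns &lt;code&gt;True&lt;/code&gt; while &lt;code&gt;f(Circle())&lt;/code&gt; returns &lt;code&gt;False&lt;/code&gt;: the
child provides less than the parent promised, so it is not substitutable.&lt;/p&gt;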
&lt;p&gt;I don&#x27;t have much to critique about this principle itself. To be honest it
sounds alright to me. However, this is just &lt;strong&gt;one more thing&lt;/strong&gt; to remember when
you&#x27;re using polymorphism, which counts as a &lt;strong&gt;negative&lt;/strong&gt; against achieving the
OCP through the abstraction techniques we discussed above.&lt;/p&gt;
&lt;p&gt;Apart from performance and memory degradation, the programmer also has to worry
about loose/tight contracts between the parent and the child. This means that
the functionality may not work that well if the type implementation is
faulty.&lt;/p&gt;
&lt;p&gt;Martin himself comments that LSP violation can be costly. If the interface is
being used in many different places, the cost of repairing this violation might
be too much to take. A possible solution, as stated by Martin, is to provide an
&lt;code&gt;if/else&lt;/code&gt; statement to make sure the Ellipse is indeed an Ellipse.&lt;/p&gt;
&lt;p&gt;Another problem of this type of polymorphism is that the Circle object is way
simpler than an Ellipse. This means that the Circle class will inherit many
methods that it doesn&#x27;t need to use, generating &lt;strong&gt;method spam&lt;/strong&gt;. This violates
another principle of SOLID, the Interface Segregation Principle that we will
look into later on.&lt;/p&gt;
&lt;h3&gt;The Dependency Inversion Principle (DIP)&lt;/h3&gt;
&lt;blockquote&gt;
&lt;p&gt;Depend upon Abstractions. Do not depend upon concretions.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;In essence, this principle means that code in high level modules should not
depend on code in low level modules. High level modules are usually top
abstractions that express the policies of the application, whereas low level
modules contain the implementation details. In other words, the abstraction
should not worry about the implementation.&lt;/p&gt;
&lt;p&gt;By inverting the dependency, i.e., making the low level modules depend on the
high level ones and not the opposite, a developer will be DIP compliant.&lt;/p&gt;
&lt;p&gt;You might be noticing a trend now. Most of these SOLID rules are greatly
related to polymorphism. After all, they were created for OOP. As I feel like I
have addressed the main trade-off in that approach, I could easily end up
repeating myself here. Let&#x27;s proceed nonetheless :&#x27;)&lt;/p&gt;
&lt;p&gt;Inverting dependencies in a codebase must be seen as a trade-off. Your
abstraction can now be changed without having to worry about the implementation
details, but you can no longer change the implementation details without having
to worry about the abstraction. This means that if the interface changes often,
it will be harder to manage the concrete implementations.&lt;/p&gt;
&lt;p&gt;Moreover, classes that could simply be a concrete implementation alone, with
no dependency on a high-level interface, now may have an artificial interface
so that this principle isn&#x27;t violated. This usually happens because &amp;quot;you never
know whether another concrete implementation that needs to use the interface
might come about&amp;quot;. One regular example is an interface &lt;code&gt;ShipmentPolicy&lt;/code&gt; which
has &lt;strong&gt;only one&lt;/strong&gt; concrete implementation, often called &lt;code&gt;ShipmentPolicyImpl&lt;/code&gt;.
One could say that this is a code redundancy.&lt;/p&gt;
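&lt;p&gt;As a rough illustration of that redundancy, here is a Python sketch. Only the
names &lt;code&gt;ShipmentPolicy&lt;/code&gt; and &lt;code&gt;ShipmentPolicyImpl&lt;/code&gt; come from the text above;
&lt;code&gt;PlainShipmentPolicy&lt;/code&gt;, &lt;code&gt;checkout&lt;/code&gt; and the pricing rule are hypothetical.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;from abc import ABC, abstractmethod


class ShipmentPolicy(ABC):
    # High-level abstraction that callers are supposed to depend on.
    @abstractmethod
    def cost(self, weight_kg):
        raise NotImplementedError


class ShipmentPolicyImpl(ShipmentPolicy):
    # The only concrete implementation in the codebase.
    def cost(self, weight_kg):
        return 5 + 2 * weight_kg


class PlainShipmentPolicy:
    # Same behaviour without the extra layer. An interface can still be
    # extracted later if a second implementation ever shows up.
    def cost(self, weight_kg):
        return 5 + 2 * weight_kg


def checkout(policy, weight_kg):
    # The caller only relies on the policy having a cost() method.
    return policy.cost(weight_kg)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Both &lt;code&gt;checkout(ShipmentPolicyImpl(), 3)&lt;/code&gt; and
&lt;code&gt;checkout(PlainShipmentPolicy(), 3)&lt;/code&gt; behave identically; the first version
just has one more type to read and maintain.&lt;/p&gt;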
&lt;h3&gt;The Interface Segregation Principle (ISP)&lt;/h3&gt;
&lt;p&gt;ISP states that no code should depend on methods it does not use. This implies
having smaller interfaces so that clients that depend on them only need to know
a few relevant methods.&lt;/p&gt;
&lt;p&gt;This idea sounds reasonable, and it&#x27;s a way of controlling method spam when
inheriting from bloated interfaces. But I think this technique is more of a
refactoring tool than a principle in itself.&lt;/p&gt;
&lt;p&gt;Treating ISP as a principle can add unnecessary complexity. You should design
your code to solve the problem at hand rather than worrying about whether your
interface will become too bloated in future when more use cases are added. As
you don&#x27;t know how your codebase will progress in future, you shouldn&#x27;t make
compromises early on. If an interface has grown too much and you &lt;strong&gt;now&lt;/strong&gt; have a
legitimate reason to split it into smaller ones, then go and refactor it.&lt;/p&gt;
&lt;p&gt;As with all the other principles we have discussed so far, take ISP as a
trade-off instead of a principle.&lt;/p&gt;
&lt;p&gt;Here are a few quotes from the Code Complete book that are relevant to the
matter:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;If the derived class isn&#x27;t going to adhere completely to the same interface
contract defined by the base class, inheritance is not the right
implementation.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;Be suspicious of base classes of which there is only one derived class.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;The Single Responsibility Principle (SRP)&lt;/h3&gt;
&lt;p&gt;This principle wasn&#x27;t included in the original publication I linked above, and
more details can be found in this blog post from &lt;a href=&quot;http://web.archive.org/web/20240328163818/https://blog.cleancoder.com/uncle-bob/2014/05/08/SingleReponsibilityPrinciple.html&quot;&gt;clean
coder&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;This principle builds on the &amp;quot;Separation of Concerns&amp;quot; term that was
popularised in the famous article &amp;quot;On the role of scientific thought&amp;quot; by Dijkstra
(&lt;a href=&quot;http://web.archive.org/web/20221104003446/https://www.cs.utexas.edu/~EWD/ewd04xx/EWD447.PDF&quot;&gt;source&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;The principle can be summarised as:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The Single Responsibility Principle (SRP) states that each software module
should have one and only one reason to change.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The definition is underspecified, and that is my major criticism here. What
counts as a reasonable &amp;quot;reason to change&amp;quot;?&lt;/p&gt;
&lt;p&gt;Martin gives more information in the blog post linked above, saying:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Imagine you took your car to a mechanic in order to fix a broken electric
window. He calls you the next day saying it’s all fixed. When you pick up
your car, you find the window works fine; but the car won’t start. It’s not
likely you will return to that mechanic because he’s clearly an idiot.
...
That’s how customers and managers feel when we break things they care about
that they did not ask us to change.
...
Another wording for the Single Responsibility Principle is:
Gather together the things that change for the same reasons. Separate those
things that change for different reasons.
...
However, as you think about this principle, remember that the reasons for
change are people. It is people who request changes. And you don’t want to
confuse those people, or yourself, by mixing together the code that many
different people care about for different reasons.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I agree that mixing code from different teams with different responsibilities
isn&#x27;t ideal. I would call that unnecessary coupling.&lt;/p&gt;
&lt;p&gt;It is obviously bad, for example, for a change in a back-end billing engine of
a bank to affect its front-end application and display data in a different
format.&lt;/p&gt;
&lt;p&gt;Although I think that the definition isn&#x27;t ideal, the principle here does sound
like the most reasonable in the list.&lt;/p&gt;
&lt;h2&gt;Conclusions&lt;/h2&gt;
&lt;p&gt;In my opinion, those shouldn&#x27;t be called &amp;quot;principles&amp;quot;. The word &lt;em&gt;principle&lt;/em&gt; as
it is defined below should be reserved for terms that are really hard to
debunk:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Principle: a fundamental truth or proposition that serves as the foundation
for a system of belief or behaviour or for a chain of reasoning. &amp;quot;the basic
principles of justice&amp;quot;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Most of what we have seen here fits the saying &amp;quot;different horses for
different courses&amp;quot;. I do appreciate that when those principles were written
OOP was all the rage, many people were investing resources in it, and many
books were being written. So I can accept that they were the single truth at
the time.&lt;/p&gt;
&lt;p&gt;However, in the decades since the inception of those principles, many other
things have improved in the industry. I feel like we have moved forward as a
whole.&lt;/p&gt;
&lt;p&gt;OOP is no longer considered &lt;strong&gt;the only way&lt;/strong&gt; of developing software, though
it is still very popular. Polymorphism, inheritance, and most importantly
multi-level inheritance are frowned upon. More and more people have come to
appreciate composition over inheritance.&lt;/p&gt;
&lt;p&gt;The classic textbook example of inheritance, Shape-&amp;gt;Ellipse-&amp;gt;Circle, is very
hard to derive in the real world without forcing it. Inheritance, and thus
polymorphism, has become a way of getting generic functionality from parent
classes &lt;strong&gt;instead of sharing identities&lt;/strong&gt; between classes of the same base
implementation, and thus much tangled code has been created so that unrelated
classes could get the same shared behaviour. I feel like a lot of people
nowadays have scars to prove that.&lt;/p&gt;
&lt;p&gt;Nonetheless, I feel positive that Martin has created these principles even
though I don&#x27;t agree with them fully. It is easy to look back on the past and
point fingers about decisions that don&#x27;t apply to the present. I think that
overall the popularity of SOLID and the outcome of having more people thinking
about designs and their own set of principles is a positive thing.&lt;/p&gt;
&lt;script async src=&quot;https://scripts.simpleanalyticscdn.com/latest.js&quot;&gt;&lt;/script&gt;</content><published>2023-04-15T00:00:00Z</published><updated>2024-07-06T00:00:00Z</updated></entry><entry><title>Linux Rice</title><link href="https://www.marcelofern.com/posts/linux/rice/index.html"/><id>tag:marcelofern.com,2020-08-30:/linux/rice/index.html</id><content type="html">&lt;h1&gt;Linux Rice&lt;/h1&gt;
&lt;pre&gt;&lt;code&gt;Created: 2020-08-30
Updated: 2024-07-06
&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;When someone is &amp;quot;ricing&amp;quot; their Unix system, they are making functional and
visual customisations to their desktop. These changes could be anything from
changing the colour of a status bar to completely restructuring their computer
environment.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;&lt;strong&gt;Why?&lt;/strong&gt;&lt;/em&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Productivity&lt;/strong&gt;: You can customise your applications and keyboard shortcuts
to suit your workflow.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Performance&lt;/strong&gt;: You are in control of what gets installed on your
system and don&#x27;t have to worry about unknown apps running in the
background.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Privacy&lt;/strong&gt;: It is your system and the defaults in some distributions
can contain software that can spy on your behaviour like Canonical has done
in the past.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Visual Satisfaction&lt;/strong&gt;: Whatever colour scheme you like.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Because it is fun&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;What does it look like?&lt;/h2&gt;
&lt;p&gt;&lt;img src=&quot;dark_theme.png&quot; alt=&quot;dark_theme&quot;&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-sh&quot;&gt;$ neofetch
                   -`                    x@x
                  .o+`                   ---
                 `ooo/                   OS: Arch Linux x86_64
                `+oooo:                  Host: 20Q5S01400 ThinkPad L490
               `+oooooo:                 Kernel: 6.6.39-1-lts
               -+oooooo+:                Uptime: 1 hour, 32 mins
             `/:-:++oooo+:               Packages: 519 (pacman)
            `/++++/+++++++:              Shell: bash 5.2.26
           `/++++++++++++++:             Resolution: 1920x1080
          `/+++ooooooooooooo/`           WM: i3
         ./ooosssso++osssssso+`          Theme: Adwaita [GTK2/3]
        .oossssso-````/ossssss+`         Icons: Adwaita [GTK2/3]
       -osssssso.      :ssssssso.        Terminal: alacritty
      :osssssss/        osssso+++.       Terminal Font: LiterationMono Nerd Font
     /ossssssss/        +ssssooo/-       CPU: Intel i7-8565U (8) @ 4.600GHz
   `/ossssso+/:-        -:/+osssso+-     GPU: Intel WhiskeyLake-U GT2 [UHD Graphics 620]
  `+sso+:-`                 `.-/+oso:    Memory: 1571MiB / 7134MiB
 `++:.                           `-/+/
 .`                                 `/

du -h /
# 5.6G
&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Workflow Tools&lt;/h2&gt;
&lt;p&gt;Most of my tools revolve around &lt;a href=&quot;https://github.com/davatorium/rofi&quot;&gt;rofi&lt;/a&gt;
which is an application launcher.&lt;/p&gt;
&lt;p&gt;For example, my &amp;quot;TODO&amp;quot; list is a script that lets me add, read, and remove
entries from a list via rofi:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;todo.png&quot; alt=&quot;todo&quot;&gt;&lt;/p&gt;
&lt;p&gt;You can configure rofi to run any script you like.
Some of my scripts include: two-factor authentication, VPN connections, getting
a wayback-machine link to a website, compiling my website and RSS feed, etc.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;rofi_scripts.png&quot; alt=&quot;rofi_scripts&quot;&gt;&lt;/p&gt;
&lt;h2&gt;Software used&lt;/h2&gt;
&lt;h3&gt;Neovim&lt;/h3&gt;
&lt;p&gt;I use Neovim for coding (work and personal projects), for taking notes, and
journaling.&lt;/p&gt;
&lt;p&gt;This website, for example, is written entirely in Neovim. The website is
built from markdown files that are parsed by a C program, which
converts the &lt;code&gt;.md&lt;/code&gt; files into &lt;code&gt;.html&lt;/code&gt; files.&lt;/p&gt;
&lt;p&gt;After having tried so many auto-generators and converters, I decided to build
myself a simple and fast one. It was also such a fun C project.&lt;/p&gt;
&lt;p&gt;This is what editing this website looks like, along with a snippet of the C
code used to compile the &lt;code&gt;.md&lt;/code&gt; files into &lt;code&gt;.html&lt;/code&gt;:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;neovim.png&quot; alt=&quot;neovim&quot;&gt;&lt;/p&gt;
&lt;h3&gt;Arch Linux&lt;/h3&gt;
&lt;p&gt;Arch has been my daily driver since 2019. Before that, I used Linux Mint and
Ubuntu at work, back when I neither knew nor cared about the differences between
all the flavours of Linux. I have also had to use macOS for a job where the
company required developers to use Apple machines.&lt;/p&gt;
&lt;p&gt;I have tried other flavours of Linux on virtual machines in the past,
but I decided to stick with Arch Linux given how simple it is to customise.&lt;/p&gt;
&lt;p&gt;That means I can use my simple bash script to download and auto-configure my
system without manual intervention. I can also sync my environment between my
work laptop and my personal laptop with one command. Apart from the hardware,
all my machines are identical from a user&#x27;s perspective.&lt;/p&gt;
&lt;p&gt;A basic understanding of Linux is necessary to use this distro. You don&#x27;t need
much more than being comfortable around the terminal these days. This is
because everything you need to learn is already in the Arch wiki, and Arch now
has assisted installation scripts via &lt;code&gt;archinstall&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Some other perks of using Arch are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Rolling releases&lt;/li&gt;
&lt;li&gt;Rich user repository (AUR)&lt;/li&gt;
&lt;li&gt;The fantastic Arch wiki&lt;/li&gt;
&lt;li&gt;No corporations behind it (community support only)&lt;/li&gt;
&lt;li&gt;Helpful community&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;i3-gaps&lt;/h3&gt;
&lt;p&gt;i3-gaps is a fork of i3wm (a tiling window manager) for X11. Instead of
having stacked windows that overlap (like in Microsoft Windows or macOS),
windows are organised side-by-side by default, with gaps between them.
The benefits are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Customisable keyboard shortcuts for navigating between windows.&lt;/li&gt;
&lt;li&gt;Easy to set up.&lt;/li&gt;
&lt;/ul&gt;
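&lt;p&gt;A minimal sketch of what such shortcuts can look like, assuming a config
file at &lt;code&gt;~/.config/i3/config&lt;/code&gt;. The modifier key, the vim-style key
choices, and the gap sizes below are illustrative assumptions, not the exact
setup described in this post:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Hypothetical excerpt of ~/.config/i3/config (values are illustrative)
set $mod Mod4

# Move focus between tiled windows with vim-style keys
bindsym $mod+h focus left
bindsym $mod+j focus down
bindsym $mod+k focus up
bindsym $mod+l focus right

# Move the focused window within the layout
bindsym $mod+Shift+h move left
bindsym $mod+Shift+j move down
bindsym $mod+Shift+k move up
bindsym $mod+Shift+l move right

# Gaps between windows (inner) and around the screen edge (outer), in pixels
gaps inner 10
gaps outer 5
&lt;/code&gt;&lt;/pre&gt;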
&lt;h3&gt;polybar&lt;/h3&gt;
&lt;p&gt;Polybar was the easiest and most user-friendly status bar I could find. With
a lot of pre-configured setups and out-of-the-box integrations, getting up to
speed was very simple.&lt;/p&gt;
&lt;h3&gt;rofi&lt;/h3&gt;
&lt;p&gt;A very simple and configurable application/script launcher.&lt;/p&gt;
&lt;h3&gt;pywal&lt;/h3&gt;
&lt;p&gt;For setting colours and themes.&lt;/p&gt;
&lt;h2&gt;More about Ricing&lt;/h2&gt;
&lt;p&gt;When it comes to finding inspiration for ricing in Linux, a good place to look
is &lt;a href=&quot;https://www.reddit.com/r/unixporn/&quot;&gt;/r/unixporn&lt;/a&gt;. Most of my setup came
from picking apart different rices that users have shared in that subreddit. It
is also a good place to visit once in a while to stay up-to-date with what the
rest of the community is using and trying.&lt;/p&gt;
&lt;script async src=&quot;https://scripts.simpleanalyticscdn.com/latest.js&quot;&gt;&lt;/script&gt;</content><published>2020-08-30T00:00:00Z</published><updated>2024-07-06T00:00:00Z</updated></entry>
</feed>