Codex Audentia

Codex: An ancient manuscript text in book form.
Audentia: Latin for “audacity”.

This is my codex — a working notebook with my notes, experiments, and rambles in their full glory. It is raw, unpolished and unfiltered.

This is not a blog.

You can subscribe to these posts here.

I’m building a 1,000 year company, and writing about the process.

Defining the MVP of DenseLayers

By DenseLayers, Rambles No Comments

DenseLayers is a website where people can discuss research papers online, paragraph by paragraph. It is the project I am working on right now, and I aim to launch the site by Dec 31. The story of how it came about is quite long, but essentially it started with me explaining the AlphaGo paper on Medium. I’ll skip to a more pressing matter – what features should be in the first version of the website. As always, don’t mind if this turns into a rambling post towards the end.

***

As per my philosophy, my first version should have the following qualities:

  1. It should solve a pressing problem at least 4x better than what people do currently.
  2. Doesn’t matter how much friction there is, as long as it solves that problem better than the alternatives.
  3. It’s a proof of concept – a “test”. As a Systems Engineer, I’ve been taught to believe that a good test tells you not just what’s wrong, but also what to do. This means there should be some way to collect feedback (=data) from the MVP.
  4. It should be as intimate and unscalable as possible without becoming a pain in the ass for me. I want to be as close to the website’s users as possible, but don’t want to be overwhelmed. So I need to decide what should be automated and what shouldn’t.

So here it is: the list of features in V1.

Front Page

This page doesn’t need to be flashy. Understanding DenseLayers’ purpose is not rocket science, so I don’t need to put effort into explaining something that’s obvious. For nostalgia, here’s how the site will look in its first iteration (this is a screenshot of what I’ve actually built).

[Screenshot: front page, V1]

1. The front page should have a list of all papers currently on the website, ranked in some way. These list items should have some info about the paper, as well as a link to the full paper’s place on the website.
2. Navigation links at the top of the page: About and Site Rules are a must. Also Login/Sign up.

User accounts

Users can sign up, log in, log out, stay logged in (“remember me”), reset a forgotten password etc. I’ll talk about user anonymity and privacy later. Users shall also have a profile page where they can have a little bio, contact details etc. Nostalgia:

[Screenshots: login page and sign-up page, V1]

Single paper display

1. Users can read the whole paper, broken into individual paragraphs that I’ll call “fragments”.
2. Users can add and delete their own explanations for a particular fragment. Each fragment will have its own comment thread that opens when you click it.
3. Users can also leave comments on the overall paper (not just individual paragraphs). (Update: I’ve decided this one is not critical for V1.)
4. Users CANNOT comment on someone else’s post/explanation. No need. I don’t want good thoughts to be buried inside long nested comments. The only thing you can do is write your own post.
5. Don’t display users’ name on the paper page (!!). Give users a random temporary name or a pixel art thumbnail. Users can only see their own names.
6. Users’ profile pages can be opened however, if someone clicks on their pixel art/random name.
7. Users can vote on paper fragments that are particularly unclear and need more discussion/explanation. This will direct the community towards the parts of the paper that need the most attention, stimulating more productive discussions (“bang for the buck”).
8. Users CANNOT tag or “@mention” other users in their explanations or comments.

Here’s a wireframe of what it might look like. Interestingly, I made these wireframes more than a year ago and now finally sat down to build the damn thing!

[Wireframe: single paper page]
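The voting rule in point 7 of the list above could be sketched roughly like this in Python. This is only an illustration under assumptions the post does not state: each user gets one vote per fragment, and fragments are identified by simple ids. The names `FragmentVotes`, `vote` and `most_unclear` are hypothetical and not taken from the actual DenseLayers code.

```python
from collections import defaultdict


class FragmentVotes:
    """Tracks which users marked which fragments as unclear (hypothetical helper)."""

    def __init__(self):
        # fragment_id -> set of user_ids; a set makes a repeated vote a no-op
        self._voters = defaultdict(set)

    def vote(self, user_id, fragment_id):
        """Record a vote. Returns True if it was new, False if already counted."""
        voters = self._voters[fragment_id]
        if user_id in voters:
            return False
        voters.add(user_id)
        return True

    def most_unclear(self, limit=3):
        """Return (fragment_id, vote_count) pairs, most-voted first."""
        counts = [(frag, len(users)) for frag, users in self._voters.items()]
        # Sort by count descending, then by fragment id for a stable order
        counts.sort(key=lambda pair: (-pair[1], pair[0]))
        return counts[:limit]
```

Calling `most_unclear()` for a paper would give the handful of fragments that the community could be pointed to first.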

Data Collection and Metrics

I think it was Peter Drucker who said that what gets measured, gets improved. But I don’t yet know what overall metrics I want to be looking at – at this stage, the purpose is not “growth”, but rather having a site that I can show to users and get in-person feedback by watching how they use it, so that I can improve the product. So I’ll only spend the bare minimum of my time on setting up vanity metrics, at the acceptable risk of not being data-driven enough. I can always add all those analytics packages etc later.

1. Definitely collect information about how many views a paper has.
2. Also collect some user visit data. Now, what should count as a “visit”? I’m interested in seeing how “active” the site is, and at what times the website is most active. So essentially, every time someone loads a page on the site, I’ll consider it a “visit”. It can always change later. Interestingly, this definition did change later :D. I am now defining a “visit” as a user visiting a paper on the site, and not just any page. The only visits that are important are the ones when people are reading papers!
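The two metrics above (views per paper, and the times of day when the site is most active) need very little machinery. Here is a minimal in-memory sketch in Python, assuming the later definition of a “visit” as one load of a paper page. The class name `VisitLog` and its methods are made up for illustration; a real version would write to the database instead of keeping counters in memory.

```python
from collections import Counter
from datetime import datetime, timezone


class VisitLog:
    """Counts paper-page visits (hypothetical helper, in-memory only)."""

    def __init__(self):
        self.views_per_paper = Counter()   # paper_id -> number of visits
        self.visits_per_hour = Counter()   # hour of day (0-23) -> number of visits

    def record(self, paper_id, when=None):
        """Record one visit to a paper page; defaults to the current UTC time."""
        when = when or datetime.now(timezone.utc)
        self.views_per_paper[paper_id] += 1
        self.visits_per_hour[when.hour] += 1

    def busiest_hours(self, limit=3):
        """Return (hour, visit_count) pairs, busiest first."""
        return self.visits_per_hour.most_common(limit)
```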

Spam

Steve Huffman to the rescue! His lectures on spam were a lifesaver, without which I’d be totally lost. The key idea is that just by being familiar with the motivations and behavior of spammers, you can fix a LOT of things without technical wizardry. So here’s what we’re gonna do.

  1. Check a time difference between a person’s consecutive posts. If a person is posting new stuff within less than 2 minutes after the previous one, it’s likely that either they are posting ads, or they didn’t put much thought into what they are saying. I don’t want DenseLayers to be a place where people talk mindlessly. So if you post within 2 minutes of your last post, I’ll stop you there and ask you to slow down.
  2. If someone includes a link within their comment, I’ll add the attribute rel="nofollow" to it so that search engines don’t give the linked page any ranking credit. That takes away most of the reason to post link spam in the first place.
  3. Every time someone shows spammy behaviour (posting too quickly, or being flagged by the community), increase their account’s spam score (this score will be invisible of course). Keep a threshold above which a user is considered a spammer. If someone crosses it, log them out of their account and ask them to contact me personally. In future, I might not even let them know that they’ve been flagged – I’d just quietly restrict their ability to post stuff on the site. This is commonly known as shadow banning, a form of “security through obscurity”.
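Points 1 and 3 of this list can be combined into one small check that runs before a post is accepted. The sketch below is a guess at the shape of it, not the real implementation. The 2-minute gap comes from the text above, while the threshold value, the class name `SpamTracker` and the return strings are assumptions made for the example.

```python
from datetime import datetime, timedelta

MIN_POST_GAP = timedelta(minutes=2)  # from the post: at least 2 minutes between posts
SPAM_THRESHOLD = 5                   # assumed value; the post does not give a number


class SpamTracker:
    """Hidden per-user spam score plus a posting rate limit (hypothetical helper)."""

    def __init__(self):
        self.last_post_at = {}  # user_id -> time of the last accepted post
        self.spam_score = {}    # user_id -> count of spammy events, never shown to users

    def flag(self, user_id):
        """Add one spammy event, e.g. a community flag or a too-fast post."""
        self.spam_score[user_id] = self.spam_score.get(user_id, 0) + 1

    def is_blocked(self, user_id):
        """True once the score reaches the threshold."""
        return self.spam_score.get(user_id, 0) >= SPAM_THRESHOLD

    def try_post(self, user_id, now):
        """Return 'ok', 'slow_down' or 'blocked' for a post attempted at time `now`."""
        if self.is_blocked(user_id):
            return "blocked"
        last = self.last_post_at.get(user_id)
        if last is not None and now - last < MIN_POST_GAP:
            self.flag(user_id)  # posting too quickly counts as spammy behaviour
            return "slow_down"
        self.last_post_at[user_id] = now
        return "ok"
```

A rejected post does not reset the timer, so the user only has to wait out the remainder of the original 2 minutes.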

Internationalization, Localization, Accessibility

The very fundamental premise of DenseLayers is to open up access to scientific discussion for everyone, all around the world. So it goes without saying that people should be able to write comments and explanations in the language they prefer. In the first version of the website, I won’t localize it to different languages. However, people can add comments in whatever language they prefer – I’ll try to make sure the text is in Unicode and doesn’t break on different browsers and the database stores their words without corruption. There will be no other accommodation, I’m sorry but I have limited time.
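One cheap way to check the “stores their words without corruption” part is a round-trip test: write a comment in a non-Latin script to the database, read it back, and compare. The sketch below assumes SQLite purely for convenience (the post only says “SQL database”), and the function names `clean_comment` and `roundtrip` are hypothetical. Normalizing to NFC before storing is an extra assumption, so that visually identical strings are stored identically.

```python
import sqlite3
import unicodedata


def clean_comment(text):
    """Normalize to Unicode NFC and trim surrounding whitespace."""
    return unicodedata.normalize("NFC", text).strip()


def roundtrip(text):
    """Store a comment in a throwaway in-memory database and read it back."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE comments (id INTEGER PRIMARY KEY, body TEXT)")
    # Parameterized query: the text is passed as data, never spliced into the SQL
    conn.execute("INSERT INTO comments (body) VALUES (?)", (clean_comment(text),))
    (stored,) = conn.execute("SELECT body FROM comments").fetchone()
    conn.close()
    return stored
```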

As for accessibility by people who have disabilities, I cannot really include many features but I will try to build the front-end using best practices that make it less painful for them.

I’d also love people with poor internet connections to use my site. The paper pages will be very heavy on images, but I will try to reduce the resolution to an optimal number, so that the file sizes are as small as possible (that helps with data storage costs as well, so it’s a win-win). I’m also flirting with the idea of not using any front-end frameworks, and instead just creating my own mini framework to use for V1. That should make the site leaner and simpler (although for future versions of the site, it’s a trade-off. I might give in to learning and using a popular framework like React or Vue etc).

Critical Decision: The front-end won’t be fully responsive. At this stage I don’t need people to use DenseLayers on their phones on the subway. I’d prefer users using it on a big screen with a proper keyboard.

Anonymity and seating at the table

Now comes a sensitive topic that I’m sure people will have differing opinions about when the site launches. Let me document my thought process here. The core premise of the website is openness. DenseLayers will be a judgement-free, hater-free and crap-free space. I’ve also decided that I want ideas to be given weight based on merit, and not the person’s credentials. This means that if a college freshman has an interesting point, it should be taken seriously. And if a professor emeritus says something stupid, it shouldn’t be accepted just because of authority. I believe that since the research community is tight-knit, if you’re scrolling down a paper on DenseLayers and see 10 people leave comments here and there, you’ll probably already know 7 of them personally and 3 will likely be from your own department. In such a scenario, it’s easy to disregard someone’s comments simply because they haven’t “earned a seat at the table”.

Let me be clear: nothing would piss me off more than that. So I’ve decided that even though users can have their own profile pages, their names won’t show next to the things they write. I’ll give each comment a different random pixel art thumbnail and that’s it. If a user does want to see the person behind a comment, they can click on the thumbnail and see the person’s profile page where they can see the person’s identity. This may sound self-defeating but it is not. I want people to be recognized, but only if their comments are particularly intriguing in some way. My hypothesis is that the friction will create a healthy balance between meritocracy and anarchy.

Alas, this means that someone who just wants to troll is also welcome on the site. So I’ll have to add other mechanisms to the site to keep trolls in check. In the future, every comment will have options that allow people to flag it for spam or douchebaggery. Too many offences, and a user is ostracized from the community. Unfortunately this feature will not be present in V1.

More importantly, culture always starts with the founder. To guide the culture on the website, I need to take the reins into my own hands and set the example for the kind of comments I want people to leave. For V1, I will strive to be the foremost contributor on the website.
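The per-comment pixel art thumbnail described above can be generated without storing any image. A rough sketch in Python, under one assumption the post does not spell out: the picture is derived from a hash of the comment id (not the user id), so it stays the same on every page load but cannot be used to link two comments to the same author. The function names `pixel_art` and `render` are made up for this example.

```python
import hashlib


def pixel_art(comment_id, size=5):
    """Return a size x size grid of 0/1 cells, mirrored left to right."""
    digest = hashlib.sha256(str(comment_id).encode("utf-8")).digest()
    half = (size + 1) // 2  # number of columns actually generated per row
    grid = []
    for row in range(size):
        # One hash byte per cell; its lowest bit decides filled vs empty
        left = [digest[(row * half + col) % len(digest)] % 2 for col in range(half)]
        # Mirror the generated columns so the thumbnail is symmetric
        grid.append(left + left[: size - half][::-1])
    return grid


def render(grid):
    """Turn the grid into text, '#' for filled cells and '.' for empty ones."""
    return "\n".join("".join("#" if cell else "." for cell in row) for row in grid)
```

Printing `render(pixel_art(some_comment_id))` gives a quick text preview; a real page would draw the same grid as coloured squares.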

Another note on anonymity

Well, I was going to talk about a different feature that I planned to introduce later, but in this post I’ll keep it to myself. The gist is that a user who has particularly strong reasons to be anonymous (whistleblower situation), should be able to choose ONE paper per month that they will be completely anonymous on, so nobody can see their identity or visit their profile even if they click on the user’s thumbnail image.

Monetization?

Probably nothing, but a separate page where I list my favorite books (with honestly-shown Amazon referral links) might pop up later.

Technical Choices

I’m using Python/Flask for the backend, with Flask plugins (extensions) for almost everything. The mini front-end framework is made by myself. I’m thinking of storing paper fragments on Amazon S3 and deploying the website itself on either AWS Lightsail or Heroku.

How do I describe my database schema?

By DenseLayers, Rambles No Comments

This is a post in which I explore what would be the best way to explain my current database schema for DenseLayers (my current project) in a clear and concise manner – which I will do in another post. This means that this is a rambling post about how I should write that future post. Writing great documentation (whether for yourself or others) is a skill and it should be practiced thoughtfully. I want to compare different ways of organizing/structuring the ideas on paper so that they make the most sense.

Update: I realize other readers may not know what DenseLayers is because I haven’t introduced it yet. It’s a website where people can discuss research papers online, paragraph-by-paragraph. I believe that having to read a paper alone by yourself is a terrible waste of time. Reading research should be collaborative.

I guess the best way is to start with the goal in mind – by the end of reading that future post, I want the following from the readers:

  1. I want them to know what high-level decisions I’ve made about the website and its functionality.
  2. I want them to see how I translated those decisions into a database architecture.
  3. I want them to see what decisions I’ve made regarding privacy and anonymity, and how they too translate into the database schema.

Yeah I think that makes sense. Let’s try a few different ways to organize and explain these ideas in a cohesive manner.

Okay, firstly readers need to know that what I’m doing right now is just the first version of the website, which is an MVP and not meant to be perfect. So most of the features that I envision in the future will not be there. Cool. Decision: I shall write a couple lines in the beginning about how this is just the first version.

Then, I should describe the features of V1, including the reasons for why. So maybe describe them as…

  1. Front page – what it will contain
  2. Paper page – what the paper page will look like, what features it will have?
  3. User sign up and login, password reset etc
  4. User profile and settings
  5. A form that I can use to quickly add new papers to the website – should only have admin access. How would this form work?
  6. How does user data etc get displayed on the paper page? (It’s not very simple)
  7. Language flexibility – people should be able to use the site in any language they want
  8. What kind of analytics and user activity I will collect from the site.
  9. Should there be a spam filter already? (Another decision to make!)

Which basically seems like describing the wireframe I drew over a year ago. Maybe I should include screens from the wireframe.

I think if I split the website functionality like this, it makes sense. Not too deep nor too shallow. So, I can now show readers my database schema and teach them about it?

Also it would obviously be worth mentioning that I’m using a SQL database – maybe that should be the first thing? I guess the readers who read that future post will be those who have a technical background and are interested in learning about my database schema.

No wait – I can’t just assume that. Maybe I should create a short list of reader personas, split by technical ability, and use that to drive the decision?

  1. People who don’t know anything about database schemas. These people are likely just into the website that I’m actually building, and happen to find the codex?
  2. People who know about database schemas, and find that codex entry many months/years later when they’re curious about how it was built. <– I guess for these people, the decisions that govern the schema design are more interesting than the schema itself. But I guess we’ve already taken care of that in the proposed structure above. So we’re good here.
  3. People who are very actively involved in the building of this website and want to have a clear picture of why anything is done. Wait, is that just me? In any case, this section of the readership is actually most important – because the most frequent reader of this codex will be myself. Decision: anything that Present Me writes should be useful to Future Me.

Future Me, you better be grateful to me for making decisions in your best interests.

Anyway – it seems that #1 is not a useful section of the audience to structure my post around. I’d rather focus on #2 and #3, and… wow, that brings us back to square one. Looks like I learned almost nothing from this whole “audience persona” mental exercise. The only thing I’ve learned is that everything I write should be useful to my future self if he ever needs to go back in time and look at my decision-making process.

Leaving that aside, I don’t know the big advantages of SQL vs NoSQL; do I need to talk about that? I pretty much went with SQL because it works perfectly for what I’m trying to do. That reminds me, I should read more blogs and watch some videos about NoSQL because I barely know anything about it. Awesome, cheers!

Wait, what would be a good way to sign off at the end? Now this is one place where I should probably use something that is appropriate to a larger audience and not just Future Me. How about a few contenders:

1. Awesome, cheers! <– sounds weird-ish
2. Thanks for reading! <– too formal?
3. Cheerio! <– this one is pleasant, I’ve used it in emails a lot
4. That’s all, folks! <– can’t believe my brain farted this one. Cringe
5. See y’all later! <– same as #4
6. …nothing? Just end at the last logical sentence? <– sounds very flexible actually

So it’s a tie between #3 and #6. Let’s just go with #6 for now. After all I can always suddenly change to whatever I feel like. Wow, I just spent another 10 minutes overthinking something so trivial.

Let’s be terrible together.

By Reflections No Comments

If you’ve seen my front page, you know that I love writing, teaching and sharing ideas.

This website shall be my codex; my personal manuscript, inspired by Leonardo da Vinci’s Codex Atlanticus. The Atlanticus is a twelve-volume set of handwritten notebooks in which he documented his ideas and studies on a great variety of subjects, from botany to weaponry. These private scribbles give us a peek into the brain of a prolific man, and have allowed us to recognize his endeavours although half a millennium has passed since.

Giant crossbow design by Da Vinci; and his self portrait

I too wish to be a documentarian of my own work, but a little differently – the Codex Audentia shall be open to the public from day one. This is where the word “audentia” comes in (Latin for audacity). You get indefinite backstage access. I will share my work in progress, my ideas and my “scribbles” – be it good or bad.

I don’t really intend for this to be a “blog”. Calling it a blog would cause me to be self-critical and judge my writings too harshly, and care too much about my “personal brand”. I don’t want that pressure. This codex is meant to capture what goes on in my brain, and my brain is very chaotic. A lot of my work will be terrible, and that’s fine. Let’s be terrible together.

‘You are remembered for the rules you break.’ – General Douglas MacArthur
