Notes: Reasonable defaults
From the video Solving the Right Problems for Engine Programmers, by Mike Acton.
The video is split into 3 general areas; these notes cover Reasonable defaults and Problem solving.
Trusty tools and techniques to reach for as a first pass.
- Linear search through array
- Fast, simple structure. Easy to find things, generally quick to traverse. Once it becomes slow, pick a more complex data structure.
- FIFO queue managed by an incrementing integer
- Good data structure for concurrency and buffering. Most of the time it is all that's needed for managing concurrency.
- Store things by type
- Rather than storing the type within each thing, organize and sort things by type. The type can then be implied.
- Multiple by default
- Solve for the statistically most common scenario first. For example, a game has hundreds or thousands of entities that all need to update. Rather than thinking of each as a single entity updating itself, handle updating multiple entities at once.
- Explicit latency and throughput constraints
- Have a clear, realistic measurement of how long your system is allowed to take to return. Understand how much data your system needs to process at a single time.
- Version serialized data
- Version any kind of serialized data. Data shapes change, and the version number lets two systems agree on the interface of that data shape.
- Memory Allocation: block, stack, scratch
- Probably the only allocators you will ever need to use. More details here.
- Model the target manually first
- Understand the shape of the output. Cheat: shape it manually if possible, as quickly as possible, before building a system to automate that output. That way you know for sure whether your system is broken. Write tests to ensure the output is correct.
- Index look aside table
- A data structure that tells you where other data structures are, similar to a database index.
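A few of these defaults can be sketched together. This is a minimal illustration in Python (the language choice is an assumption; the talk does not prescribe one), not code from the video. `RingQueue`, `update_all`, `position_of`, and all of the entity data are hypothetical. The queue is single-threaded as written; a concurrent version would need atomic counters.

```python
class RingQueue:
    """Fixed-capacity FIFO driven by two ever-increasing integer counters."""

    def __init__(self, capacity):
        self.slots = [None] * capacity
        self.head = 0  # total number of items popped so far
        self.tail = 0  # total number of items pushed so far

    def push(self, item):
        if self.tail - self.head == len(self.slots):
            return False  # full: the caller decides whether to drop or retry
        self.slots[self.tail % len(self.slots)] = item
        self.tail += 1
        return True

    def pop(self):
        if self.head == self.tail:
            return None  # empty (so None itself cannot be a queued value)
        item = self.slots[self.head % len(self.slots)]
        self.head += 1
        return item


# Store by type, multiple by default: one plain array per type, so the type
# is implied by which array a record lives in and is never stored per record.
positions_by_type = {"bullet": [0.0, 5.0], "debris": [2.0]}
velocities_by_type = {"bullet": [10.0, -2.0], "debris": [0.5]}

# Index look-aside table: entity id -> (type, slot), like a database index.
index = {101: ("bullet", 0), 102: ("bullet", 1), 201: ("debris", 0)}


def update_all(dt):
    """Update every entity of every type in one pass over the arrays."""
    for kind, positions in positions_by_type.items():
        velocities = velocities_by_type[kind]
        for i in range(len(positions)):
            positions[i] += velocities[i] * dt


def position_of(entity_id):
    """Find one entity through the index instead of searching the arrays."""
    kind, slot = index[entity_id]
    return positions_by_type[kind][slot]
```

The counters only ever increase, so the number of queued items is always `tail - head`, and the slot is found with a modulo. The update loop never asks an entity what type it is, because the array it sits in already answers that.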
How to know if you're solving the right problem.
The details of the problem are always changing and being refined, so write them down.
Context is the set of things we can assume about the problem.
The more context you have the better you can make the solution.
The problem is best solved with context, rather than being generic and trying to fit into any context.
Creating generic things forces someone else to bring them all together. You aren't solving the original problem; you are pushing it down the road.
Different problems require different solutions.
If the problem changes, the solution also changes.
Things we can ask.
What are the user's needs?
What data do they need to transform and work with?
What are their concrete goals? Use plain, simple descriptions (don't list features).
What are the constraints?
- Iteration time, how long can something take before it's no longer useful?
- Size constraint, how big is the thing allowed to be? 1MB? 50GB? There is always a limit.
- Speed, how long can the system take at runtime? How long is an acceptable build time?
- Correctness, which parts must be more correct than other parts? There will always be bugs, so which parts should have fewer bugs?
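One way to keep constraints like these explicit is to write them down as data and check measurements against them. The sketch below is only an illustration: the budget values and the names `measure` and `within_budget` are hypothetical, and real budgets should come from the actual users and platform.

```python
import time

# Hypothetical budgets, written down so they can be checked and revised.
BUDGET_SECONDS = 0.016    # roughly one frame at 60 Hz
BUDGET_BYTES = 1_000_000  # a 1 MB size limit


def measure(transform, payload):
    """Run transform once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = transform(payload)
    return result, time.perf_counter() - start


def within_budget(elapsed, size,
                  max_seconds=BUDGET_SECONDS, max_bytes=BUDGET_BYTES):
    """True only if both the time and the size constraint hold."""
    return elapsed <= max_seconds and size <= max_bytes
```

A call such as `within_budget(elapsed, len(result))` after `measure(...)` turns "is this fast and small enough?" into a yes or no answer instead of a feeling.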
Common context traps
1. The "what if" game
What if it needs to do X? Question: does it actually have to do X? If so, that's important, let's write it down. What are the odds of that actually happening?
Is there a concrete example of that thing happening?
Is there a way to test for it happening?
How much experience do you/we have with this problem?
Solving problems you don't have creates problems you definitely do have.
Future proofing is a foolish goal; different problems require different solutions. Things will always change.
What is the lifespan of the system? 1 year, 10 years?
2. Trying to conceptually oversimplify
The problem is as complicated as the problem.
Hardware presents real limits: how fast is the network? How fast is the device? Reality can't be avoided.
Concepts and abstractions may not fit real world scenarios.
Spend time analysing the input and output data required. Where is the input data coming from? How fast does it arrive?
3. Overcomplicating
Trying to make something too generic.
Trying to solve the problem by telling a story around made-up concepts, e.g. objects, hierarchical relations, "this calls this". What happens when reality gets in the way of the story? The story isn't the problem you're solving; transforming data is the problem you're solving. Be careful about falling into OOP traps.
Using generic as an excuse to not solve the problem, pushing it further down the road.
Using generic to not talk to and gather real user requirements.
How much is it worth to solve this problem? 2 weeks of time? 6 weeks? 2 years? What if you had 1/6th of the time? What's the fastest possible way to achieve the output? Do that first (plan B), then spend the time making it better.
Understanding why: what is the business case?
Will it reduce the cost of entry? By how much?
Is it something consumers are expecting?
Will it reduce development time? By how much?
Does it enable exploration? e.g. we need to learn about something.
Does it reduce size? By how much?
Does it improve speed? By how much?
Predicted dev time cost (will generally be wrong).
Opportunity cost (better vs new), e.g. time you can't spend on other things.
Maintenance cost (the real cost). Probability of bugs based on dependencies, e.g.
- Maintaining an understanding of the data
- Changing requirements
- Communicating constraints
- Untested transforms
- Unexpected use cases
- Dependency changes
- Bad inputs
- Usage training
- Any changes whatsoever
Build vs Buy? How well can you reason about Value/Cost?
Understand the realities of the platform you are building on.
- The desktop device
- The mobile device
- The network latency
- The network type (fibre, 2G, 3G)
- The server hardware
- The browsers
- The browser engines
Understand your tools
Compilers, transpilers, transformers. How do they work?
Tools that get in the way of understanding how they work are bad tools.
Everything is a data problem.
The purpose of all programs is to transform one kind of data into another kind.
What are the limits of the data?
How many objects / entities are there?
What kind of range of data is there?
How likely are certain instances of data to appear, statistically?
How likely over time? Does it change?
How likely is the data to be grouped, statistically?
Sample inputs and outputs.
What does it look like? Have examples.
What is read and written over time? What are the access patterns?
When is data accessed statistically relative to other data?
What data values are common? Outliers?
What data ranges are common? Outliers?
What data causes branches? Look at your state machine.
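Several of these questions can be answered by running a quick summary over real sample data. The following is a minimal sketch, assuming the samples are plain numbers. `summarize`, the `sizes` data, and the outlier rule (anything more than ten times the median) are all hypothetical, chosen only for illustration.

```python
from collections import Counter
from statistics import median


def summarize(samples, outlier_factor=10):
    """Report range, common values, and outliers for a list of numbers."""
    mid = median(samples)
    return {
        "count": len(samples),
        "min": min(samples),
        "max": max(samples),
        "median": mid,
        # The two most frequent values with their counts.
        "most_common": Counter(samples).most_common(2),
        # Hypothetical rule: anything far above the median is an outlier.
        "outliers": [s for s in samples if s > outlier_factor * mid],
    }


# Made-up sample data, e.g. payload sizes observed in a test run.
sizes = [1, 1, 1, 2, 2, 3, 100]
report = summarize(sizes)
```

Here the report shows that almost everything is tiny and one value is far larger than the rest, which suggests solving for the small case first and handling the large one separately.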
Getting the clowns out of the car
You have a car full of clowns, how do you make the car faster?
First step, get the clowns out of the car.
Find things that don't need to be done, and don't do them. This isn't premature optimization, it's just skipping unnecessary things.
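One common form of this is to recompute only what changed. The sketch below is an assumed example, not something shown in the video: `expensive_transform`, `refresh`, and the data are hypothetical, and the squaring stands in for real work.

```python
def expensive_transform(x):
    """Stand-in for real work; here it only squares the input."""
    return x * x


def refresh(values, dirty, cache):
    """Recompute only the slots flagged dirty; return how many were redone."""
    work_done = 0
    for slot in sorted(dirty):
        cache[slot] = expensive_transform(values[slot])
        work_done += 1
    dirty.clear()
    return work_done


values = [1, 2, 3, 4]
cache = [expensive_transform(v) for v in values]  # full pass, done once
dirty = set()

values[2] = 5  # one input changes...
dirty.add(2)   # ...so only its slot is flagged
first = refresh(values, dirty, cache)   # recomputes slot 2 only
second = refresh(values, dirty, cache)  # nothing is dirty, no work is done
```

The second call does no work at all, which is the point: the unchanged slots never needed touching.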