What is Lillero?

Lillero is a lightweight and simple Java ASM patching framework built on top of ObjectWeb's ASM library. It can be used in conjunction with any loader that supports the ASM library's ClassVisitor system.

Lillero is made up of multiple components:

On top of these, there's Lillero-loader, a sample loader in the form of a plugin for Minecraft Forge's ModLauncher. Lillero-loader is the only Minecraft-specific part of the Lillero system. To reiterate: while Lillero was initially developed for use with Minecraft, it works with anything, as long as you have a loader that supports the ASM library.

This book will introduce you to ASM patching and provide an in-depth guide on how to use Lillero to do it in a way that is flexible and yet comfortable.

An introduction to ASM patching

"ASM patching" means, in short, to modify the "ASM" - that is, "assembly" - of an application at runtime. In context, this refers to Java bytecode, which is the instruction set of the JVM. Java, Kotlin, Scala, Groovy and any other language targeting the JVM can, once compiled, be seen as bytecode.

Since you are modifying how the application works and have no guarantees of being the only one doing so, caution is paramount when working with ASM patches. Ask yourself: do I really need a patch to achieve this? Can I not go around it? Does doing so imply a performance loss? How big of a performance loss? Is it bearable?

In short, ASM patching should always be the very last resort. That is not to say that patching is useless: there are many problems that can only be effectively solved by modifying the bytecode of the target application. Keep in mind, though, that most problems can be effectively solved by other means.

Though reviled by many, ASM patching remains one of the most powerful tools in the Java modder's arsenal. Like every tool, ASM patching is not evil in itself. When used correctly, it can solve just about any problem elegantly with a minuscule footprint. When done incorrectly, it can wreak havoc on the entire environment, causing inexplicable crashes and pulling the rug from underneath everyone else wishing to modify the program just like you.

This latter issue has led most of the new generation of modders to reject ASM patching altogether, in favour of higher-level solutions, ditching the complexities of bytecode in favour of the checked safety of plain Java. In Minecraft's case, one such solution is Mixin.

Why (not) Mixin?

Mixin is a bytecode manipulation framework that has become very popular in recent years. Though it also relies on the ASM library, Mixin is not an "ASM patching" framework in the true meaning of the word. Self-described as a "bytecode-weaving" framework, it allows the user to manipulate the bytecode without having to manually write a single instruction.

With Mixin, you write Java (or any other JVM language) rather than raw bytecode instructions, using annotations to provide any metadata (such as location) your code might need. Working with Mixin is undeniably easier: you're trading the surgical precision of ASM patching for safety and comfort. Mixin tries to provide ways to achieve most things patching can do: as a result, it has become huge - some would say bloated - and in spite of that its replacements are clunky and impractical due to the heavy abstraction they require.

Suppose, for example, that you wish to modify the conditions of an if() statement in some way: with raw patching, since ifs are compiled down to conditional jump instructions, this is a trivial task, arguably one of the easiest you can face. With Mixin, you'll likely be duplicating and overwriting half the method: all the fancy crutches Mixin has given you now are just getting in your way.
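To make the if() example concrete, here is a minimal sketch. The class and method names are made up for illustration, and the bytecode in the comments is what javac typically emits for this source; you can check it against your own build with javap -c.

```java
// Hypothetical example class; the names are illustrative only.
final class IfAsJump {
    static String describe(int x) {
        if (x > 0) {
            return "positive";
        }
        return "non-positive";
    }
    // javac typically compiles describe(int) to something like:
    //
    //     iload_0               // push x onto the stack
    //     ifle L1               // jump to L1 if x <= 0 (the inverted condition)
    //     ldc "positive"
    //     areturn
    //   L1:
    //     ldc "non-positive"
    //     areturn
    //
    // Swapping the single ifle for iflt would turn the check into x >= 0.

    public static void main(String[] args) {
        System.out.println(describe(5));
        System.out.println(describe(0));
    }
}
```

The whole condition lives in one conditional jump, which is why changing it with a raw patch amounts to touching a single instruction.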

Myths

A widespread myth is that Mixin "allows for greater compatibility" with other mods that work to modify the same part of the code. This is a half-truth at best. Poorly written Mixins can break compatibility as much as any bad ASM patch; conversely, properly made Mixins will work just as well as properly written ASM patches.

The main reason people say this is that the worst Mixin (one that @Overwrites methods when it really isn't needed) is better than the worst ASM patch (one that injects its bytecode in the wrong spot): the former will "simply" erase any changes made by others, while the latter will crash your program in the best case, and cause weird undetectable behaviour in the worst. What I just said is an undeniable truth; it's also true that the best ASM patch is, depending on the task, equal to or better than the best Mixin, due to its superior precision and overall lower impact on the resulting code. Now, knowing this, ask yourself: are you aiming to write the best, or the worst?

Upsides

The one upside Mixin truly has is that it's stricter: it performs a number of checks to ensure the validity of what you wrote, and since you're writing plain Java (or whatever other language), the compiler will also check the validity of your code. You have no such safety net in raw ASM.

Finally, as I mentioned, Mixin is a rather big library; while most Minecraft mod loaders nowadays bundle it (which is a questionable design choice, but that's a topic for another time), this is not the case in other environments. In many cases, I've seen Mixin binaries bigger than the programs they were supposed to be backing.

Conclusion

Ultimately, whether to use Mixin or ASM patching is a matter of personal preference. Lots of great programmers choose not to bother with the complexities of bytecode and instead entrust that part to Mixin, and lots of incompetent programmers try and fail to do it manually, creating the botched patches that sparked this whole debate. Unfortunately, the latter category has given a terrible reputation to ASM patching. The purpose of this chapter is to disprove such myths, and show that ASM patching can be an effective alternative to high-level frameworks.

Why Lillero?

As you may have gleaned from the previous chapter, I am not a fan of Mixin. I respect its engineering, which is very clever, and acknowledge the problems it attempts to solve. My issue with it is that most of those problems are symptoms of a bigger one that Mixin fails to acknowledge.

The problem, the solution

Why do people fail at making patches? The answer is lack of checks mixed with general incompetence. Mixin thus set out to make it easy. My belief, though, is that the underlying issue is a general lack of readily available information on ASM patching. The Minecraft Forge forums soon banned discussion of the topic altogether, in a misguided attempt to discourage it. Should we be surprised that people are doing it wrong, if you can't talk about the topic in one of the biggest communities that may be interested in it?

Lillero was my alternative answer to those problems. I wrote Lillero with a clear goal in mind: it should allow you to do everything, while keeping it as comfortable as it can get this close to bare metal. When used to its full potential, Lillero is lightweight and flexible, but also easy to write. Coupled with this book, it should empower anyone to write good patches following the best possible practices.

Design

At the heart of Lillero lies a Java interface, which any aspiring patch should implement: it contains various methods providing any metadata that may be needed by the loader, as well as the method where the patching itself happens. As we'll see, you won't have to write most of this boilerplate by hand: the Lillero-processor will take care of generating it.

Generating is the keyword here: repetitive tasks aren't abstracted away, they are simply written by the machine. One can open the generated files and easily see what each annotation does. By design, Lillero's inner workings should be clear and easy to follow. Should one want to dig deeper, they'll find that all code in the Lillero project is heavily documented, with a Javadoc for every last method and field, so that everything is perfectly clear to anyone wishing to learn from it.

Patching

Since you are applying changes to the bytecode of a class, this must necessarily happen before said class is loaded in memory. The component that applies said changes is called a loader; don't concern yourself with the inner workings of loaders for now, just know that they are in charge of the initial step: we'll cover them in detail in their own chapter.

Suppose that you already have a working loader in place. This loader calls your injector method, and passes it a ClassNode and a MethodNode as arguments, representing respectively the container class and the method you're targeting. This is the most common type of ASM patching, and it's probably why you're here; more advanced subjects may be covered in additional chapters later on.

At a glance, this might seem restrictive. However, do keep in mind that even code outside of methods - in field declarations, in loose blocks, or in static blocks - is actually considered to be part of a method by the compiler. Specifically, the constructor (<init>) for instance fields and loose blocks, and the static constructor (<clinit>) for static fields and static blocks.
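The following sketch illustrates that point using nothing but plain Java. The class name and the log helper are made up for the example; the order of the recorded events shows that static initialisers run as part of class initialisation (<clinit>) and that instance field initialisers and loose blocks run as part of every constructor (<init>), before the constructor's own body.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: records the order in which initialisation code runs.
final class InitOrderDemo {
    static final List<String> LOG = new ArrayList<>();

    // Both of these end up in <clinit>, in source order.
    static int staticField = log("static field");
    static { log("static block"); }

    // Both of these end up in <init>, ahead of the constructor body.
    int instanceField = log("instance field");
    { log("loose block"); }

    InitOrderDemo() {
        log("constructor body");
    }

    private static int log(String event) {
        LOG.add(event);
        return 0;
    }

    public static void main(String[] args) {
        new InitOrderDemo();
        // prints [static field, static block, instance field, loose block, constructor body]
        System.out.println(LOG);
    }
}
```

Running javap -c -p on the compiled class should show the same grouping: the static initialisers appear under a "static {}" entry, the rest inside the constructor.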

Bytecode

Before we get into the specifics of bytecode manipulation, you should understand what exactly you will be dealing with. Patching essentially consists in modifying the bytecode of a class. If you're familiar with any flavour of assembly language, this will all look very familiar.

Essentially, any programming language targeting the JVM (short for Java Virtual Machine) will be converted by its compiler into machine code. The difference from natively compiled languages is that this isn't the machine code of your computer: it is the machine code of the JVM, since the JVM is what will be running your program anyway.

Java bytecode is a human-readable representation of the machine code that the JVM is meant to interpret. With the right tools, it can be manipulated to change the behaviour of a program - which brings us here. Java bytecode is relatively high-level when compared to its native counterpart, including support for more abstract concepts like classes and inheritance, but still requires a way of thinking much closer to the functioning of a machine than what is needed for regular programming.

Bytecode instructions are made up of various parts: first comes the opcode, a numerical ID (though you work with human-readable aliases for these numbers); then come a number of arguments, which vary depending on the opcode.
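As a rough illustration of opcodes and their arguments, here is a toy decoder. The class is invented for this example and only knows five opcodes; the numeric values and names themselves are the real ones from the JVM instruction set. Note how bipush is followed by one operand byte while the others take none.

```java
import java.util.ArrayList;
import java.util.List;

// Toy decoder covering only five real JVM opcodes; anything else is rejected.
final class TinyDisassembler {
    static List<String> decode(byte[] code) {
        List<String> out = new ArrayList<>();
        int pc = 0;
        while (pc < code.length) {
            int opcode = code[pc++] & 0xFF; // opcodes are unsigned bytes
            switch (opcode) {
                case 0x10: out.add("bipush " + code[pc++]); break; // one signed byte operand
                case 0x1A: out.add("iload_0"); break;
                case 0x1B: out.add("iload_1"); break;
                case 0x60: out.add("iadd"); break;
                case 0xAC: out.add("ireturn"); break;
                default: throw new IllegalArgumentException("unsupported opcode " + opcode);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Body of: static int add(int a, int b) { return a + b; }
        byte[] add = {0x1A, 0x1B, 0x60, (byte) 0xAC};
        System.out.println(decode(add)); // prints [iload_0, iload_1, iadd, ireturn]
    }
}
```

In practice you never decode bytes by hand like this: the ASM library does it for you and hands you the named constants.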

Stack-oriented programming

If you've ever attended any formal programming course, you'll certainly be familiar with the concepts of stack and heap. While in Java they'll at most be an occasional passing thought, when dealing with bytecode they become central.

The stack is a quickly-accessible memory region that follows the rule first in, last out. It's often compared to a stack of plates: you can only ever add (push) new plates on the top, and can only ever take (pop) the one on the very top. It's highly efficient, but anything that gets put on the stack must have a known memory size at compile time. This makes it suitable for working with primitives, but not quite as much for objects. Those follow different rules.

Objects are stored on the heap, and only a reference to their memory region - a map of sorts to find where their data is located - is pushed onto the stack. The heap is a messier, but bigger place: it's slower, but it allows retrieval of values from any point and doesn't need to know in advance the size of everything.

Most bytecode instructions affect the stack in some way, either by taking its arguments from it or by pushing the result of the operation onto it.
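To get a feel for this way of thinking, consider the toy stack machine sketched below. It is not real bytecode: the three opcodes and the class are invented for the example, loosely modelled on how the JVM's arithmetic instructions behave. Each instruction either pushes a value or pops its arguments and pushes the result.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy stack machine with made-up opcodes; only meant to show push/pop behaviour.
final class ToyStackMachine {
    static final int PUSH = 0, ADD = 1, MUL = 2;

    // Each row of the program is {opcode, operand}; ADD and MUL ignore the operand.
    static int run(int[][] program) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (int[] insn : program) {
            switch (insn[0]) {
                case PUSH: stack.push(insn[1]); break;
                case ADD: stack.push(stack.pop() + stack.pop()); break;
                case MUL: stack.push(stack.pop() * stack.pop()); break;
                default: throw new IllegalArgumentException("unknown opcode " + insn[0]);
            }
        }
        return stack.pop(); // whatever is left on top is the result
    }

    public static void main(String[] args) {
        // (2 + 3) * 4, written the way a stack machine wants it
        int[][] program = {{PUSH, 2}, {PUSH, 3}, {ADD, 0}, {PUSH, 4}, {MUL, 0}};
        System.out.println(run(program)); // prints 20
    }
}
```

Notice that the operands always come before the operation that consumes them; real bytecode is laid out the same way.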

Nodes

The ASM library represents sequences of bytecode as doubly linked lists, with the InsnList type. Lillero provides an extended functionality

Each instruction is a node, represented by various subclasses of AbstractInsnNode; each node contains an opcode, a number of parameters depending on the opcode type, and references to the preceding and following nodes.

The InsnList representing the method's nodes is MethodNode's instructions field. You can perform all operations you'd expect: append, insert, remove, etcetera. You should aim to leave the smallest possible footprint on the method, so removing nodes is almost always a bad idea. You can achieve the same result by jumping over the part you wish to remove.
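The idea of jumping over nodes instead of removing them can be sketched with a toy model. The classes below are stand-ins written for this example, not ASM's InsnList and AbstractInsnNode, although the real InsnList does offer add and insertBefore methods in the same spirit; with the real library the jump would be a JumpInsnNode with the GOTO opcode pointing at a LabelNode. The jumpOver and execute helpers are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-in for ASM's InsnList; NOT the real class, just the same idea.
final class ToyInsnList {
    // Toy stand-in for AbstractInsnNode: a name plus links to its neighbours.
    static final class Node {
        final String name;  // "GOTO", "LABEL", or any ordinary instruction
        final Node target;  // jump target for GOTO, null otherwise
        Node prev, next;

        Node(String name, Node target) {
            this.name = name;
            this.target = target;
        }
    }

    Node first, last;
    int size;

    // Appends an ordinary instruction and returns its node.
    Node add(String name) {
        Node node = new Node(name, null);
        node.prev = last;
        if (last == null) first = node; else last.next = node;
        last = node;
        size++;
        return node;
    }

    // Links node in front of location without touching any other node.
    void insertBefore(Node location, Node node) {
        node.prev = location.prev;
        node.next = location;
        if (location.prev == null) first = node; else location.prev.next = node;
        location.prev = node;
        size++;
    }

    // Skips from..to (inclusive) by jumping over them; to must not be the last node.
    void jumpOver(Node from, Node to) {
        Node label = new Node("LABEL", null);
        insertBefore(to.next, label);
        insertBefore(from, new Node("GOTO", label));
    }

    // Walks the list as an interpreter would: GOTO jumps, LABEL does nothing.
    List<String> execute() {
        List<String> trace = new ArrayList<>();
        Node cur = first;
        while (cur != null) {
            if (cur.name.equals("GOTO")) {
                cur = cur.target;
            } else {
                if (!cur.name.equals("LABEL")) trace.add(cur.name);
                cur = cur.next;
            }
        }
        return trace;
    }

    public static void main(String[] args) {
        ToyInsnList list = new ToyInsnList();
        list.add("A");
        Node b = list.add("B");
        Node c = list.add("C");
        list.add("D");
        list.jumpOver(b, c);
        System.out.println(list.execute()); // prints [A, D]
        System.out.println(list.size);      // prints 6: B and C are still in the list
    }
}
```

B and C never run, yet they are still there for any other patch that expects to find them, which is the point of keeping your footprint small.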

We'll now check out the various types of instruction nodes; you can find a detailed list of opcodes, with explanations, both on Wikipedia's list of Java bytecode instructions and in the Java SE Specifications.