The problem with your thinking is that you subtracted the LED voltage from the supply voltage to get the remaining voltage that must be dropped by a resistor. Normally, that thinking process is valid, assuming of course that there's only a resistor and an LED.
In your case, however, there are two transistors in series with the rest, as well.
Even when operated as switches, bipolar transistors still have a small voltage drop across them -- as little as a few tens of millivolts each to as much as a few hundred millivolts each. The exact value is quite complicated to calculate. (It can be calculated -- and is calculated by Spice programs -- it just involves a lot of detail and many equations and a few assumptions, such as operating temperature.)
And even if you do calculate it, a remaining problem is that actual parts vary so much from one to the next that the measured results will differ from what was calculated. The parameter values for the actual parts would first need to be measured (another set of complicated procedures) and then entered into the Spice simulator so that it uses the measured values in its computations. Only then will the simulator and the physical parts match up. And don't forget, each of the two transistors would need to have its own parameters measured. Measuring one doesn't tell you about the other. So Spice needs the specific parameters for each transistor to get things right.
Since that's the real world we live in, designers know in advance that they will not get the exact same values a simulator tells them. And they make designs that don't rely on very specific results. Instead, they take worst case values and worst case ranges of values and try to make a circuit design that will still provide acceptable results.
(That's not always possible to do with discrete parts. For example, a differential amplifier -- also known as a long-tailed pair -- will be mostly useless on a breadboard with real parts but will work perfectly in a simulator. For another example, a current mirror will similarly work perfectly in a simulator and terribly when you build it with discrete parts.)
So what to do? In a case like this, the usual method is to guess (an educated guess) about the typical case, plus a little more to hedge against 'worst case'. With only one transistor, a designer might purposely assume worst case. But with two, the odds are a little better that a typical number, with some hedging, will be fine.
A datasheet can also be consulted. There are huge differences between power transistors and small signal transistors, and there are plenty of other reasons to consult a datasheet, if only to verify one's memory. So let's look at the OnSemi datasheet for the 2N3904:
At the bottom there you can see that only worst case values are listed. And \$50\:\text{mA}\$ is the highest collector current they specify, too. So perhaps this isn't the best part to be using here! But let's look at this chart:
It's a typical chart, not a maximum or minimum. And it only applies when the device is operating at \$25^\circ\text{C}\$.
Note also that your base current is pretty low. I've circled it in green and provided an arrow pointing to where it probably should be to get lower voltage drops at such high currents.
Even given that you get the base current right, here you can see that at \$100\:\text{mA}\$ (they kindly provided that curve) the saturation voltage is somewhere between about \$300\:\text{mV}\$ and \$400\:\text{mV}\$. (Whether or not Spice simulation gives that value is yet another question, of course.) Note that this range exceeds the first table I showed. But that was spec'd at \$50\:\text{mA}\$. So we should expect worse when considering twice that current and more.
So I may use \$350\:\text{mV}\$ per transistor, or \$700\:\text{mV}\$ for both. This means I would calculate \$R=\frac{5\:\text{V}-2.1\:\text{V}-2\cdot 350\:\text{mV}}{120\:\text{mA}}=18\frac13\:\Omega\$.
That gives me a bead on it. But I'm not done yet. Suppose the two transistors drop less voltage than typical? Then more voltage lands on the resistor and the current goes up, and I wouldn't want to exceed the maximum LED current. So I would round the resistor value up, and not down, because that's safer. I'd use \$22\:\Omega\$.
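Here is a minimal Python sketch of that arithmetic, in case you want to replay it with other numbers. The function names and the choice of the E12 series for "standard values" are my own assumptions for illustration; the voltages and the target current are the ones used above.

```python
import math

# E12 preferred values for one decade (assumed here as the "standard" series).
E12 = (10, 12, 15, 18, 22, 27, 33, 39, 47, 56, 68, 82)

def series_resistor(v_supply, v_led, v_ce_sat, n_transistors, i_led):
    """Resistance that drops whatever voltage is left after the LED
    and the saturated transistors, at the target LED current."""
    return (v_supply - v_led - n_transistors * v_ce_sat) / i_led

def next_e12_up(r_ohms):
    """Smallest E12 value that is not below r_ohms (rounding up is the safe side)."""
    decade = 10 ** (math.floor(math.log10(r_ohms)) - 1)
    for base in E12:
        candidate = base * decade
        if candidate >= r_ohms - 1e-9:
            return candidate
    return 100 * decade  # nothing fits in this decade, so take the first value of the next

r_exact = series_resistor(5.0, 2.1, 0.35, 2, 0.120)
print(f"calculated: {r_exact:.2f} ohm, use: {next_e12_up(r_exact)} ohm")
```

With the numbers above this lands on about 18.3 ohms calculated and 22 ohms as the next standard value up.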
However, this may mean I get only \$100\:\text{mA}\$, or less. Suppose the transistors operate worst case -- or I don't supply enough base current (and you don't) -- and I get \$400\:\text{mV}\$ each. Then I'd get \$95\:\text{mA}\$. If \$95\:\text{mA}\$ to \$120\:\text{mA}\$ is acceptable for the application, then fine.
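To see how the LED current moves with the saturation voltage once the resistor is fixed at 22 ohms, a small sweep like the following can help. It is only a sketch: the function name is made up for illustration, and the three saturation voltages are just the typical-range figures read off the chart above, not guaranteed limits.

```python
def led_current(v_supply, v_led, v_ce_sat, n_transistors, r_ohms):
    """LED current for a fixed series resistor and n saturated transistors."""
    return (v_supply - v_led - n_transistors * v_ce_sat) / r_ohms

# Sweep the per-transistor saturation voltage over the range read from the chart.
for v_sat in (0.30, 0.35, 0.40):
    i = led_current(5.0, 2.1, v_sat, 2, 22.0)
    print(f"{v_sat * 1000:.0f} mV each -> {i * 1000:.1f} mA")
```

That works out to roughly 105 mA, 100 mA, and 95 mA respectively, so a 100 mV spread in saturation voltage per transistor shifts the current by about 10 mA.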
If not? Then a current source/sink may be required. And there's enough voltage overhead to apply one that requires two more transistors. So that would be the next step, if the above "regulation" wasn't good enough. (Resistors with low/small voltage drops across them make terrible LED current regulators because small voltage variations between reality and paper calculations can have large effects.)
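The point in parentheses can be put into numbers. For a plain resistor, the fractional error in current equals the voltage error divided by the voltage across the resistor. The sketch below compares the roughly 2.2 V across the resistor in this circuit with a purely hypothetical 12 V supply (my assumption, not something from your circuit), where about 9.2 V would sit across the resistor instead.

```python
def current_error_percent(v_resistor, v_error):
    """Percent change in resistor current caused by an unexpected
    voltage error, given the nominal voltage across the resistor."""
    return 100.0 * v_error / v_resistor

# 100 mV of unexpected drop (e.g. both transistors 50 mV off typical).
v_error = 0.100

# This circuit: 5 V - 2.1 V - 0.7 V = 2.2 V across the resistor.
print(f"2.2 V across R: {current_error_percent(2.2, v_error):.1f} % current error")

# Hypothetical 12 V supply: 12 V - 2.1 V - 0.7 V = 9.2 V across the resistor.
print(f"9.2 V across R: {current_error_percent(9.2, v_error):.1f} % current error")
```

The smaller the voltage the resistor drops, the more every stray 100 mV matters, which is why a resistor doing the regulating with only a couple of volts across it is so loose.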
Or use MOSFETs as the next step, which can have smaller voltage drops across them.
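Another detail is the resistor's power rating. At \$100\:\text{mA}\$, a \$22\:\Omega\$ resistor dissipates about \$0.22\:\text{W}\$, so a common 1/4 W part is marginal. I'd probably use a 1 W part. That's a big resistor.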
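For the power rating, a quick check of the dissipation at both ends of the current range is worthwhile. This is a minimal sketch using \$P = I^2 R\$ with the 22 ohm value chosen earlier; the function name is just for illustration.

```python
def resistor_power(i_amps, r_ohms):
    """Power dissipated in a resistor carrying a steady current: P = I^2 * R."""
    return i_amps ** 2 * r_ohms

for i_ma in (95, 100, 120):
    p = resistor_power(i_ma / 1000.0, 22.0)
    print(f"{i_ma} mA -> {p:.2f} W")
```

That comes out between about 0.2 W and 0.3 W, which is at or above what a 1/4 W resistor is rated for, so picking a part with a comfortably higher rating keeps it from running hot.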
And another detail is that I don't like that bipolar AND gate. But for learning purposes and if you are only using one of them, I suppose it is tolerable.