You must login before you can post a comment.
Author: Sanderokian Stfetoneri
Forked from: Sanderokian Stfetoneri/Femto-4v2.6 (Computer)
Project access type: Public
Description:
Latest versions of the 256-Series, including the Femto-4:
https://circuitverse.org/users/4699/projects/256-series
A 16-bit computer/maybe console inspired thing, the Femto-4. This is a side branch to keep a functional version around.
Currently runs:
Cart A: Flappy Bird
Cart B: Some Pixel Art
Cart C: Screensaver
Cart D: Snake
Bundle Cart: All carts in one
Assembler:
https://repl.it/@Sanderokianstfe/Femto-4-Assembler#DeveloperGuide.txt
The 256-Series:
https://circuitverse.org/users/4699/projects/256-series
Features:
Immediate, direct and indirect memory access
Jumps and conditional Jumps
16-bit address space
Switchable Memory Banks, allowing for a standard cart to hold up to 512kx16b of data
An ALU capable of logical operators, addition, subtraction, shift left, shift right, multiplying, dividing, and other specialised functions
Easy to add to buses
"Fast Execution" - Can run more than one instruction per clock cycle
"Faster Execution" - Runs instructions on both edges of the clock pulse
16x16 pixel display with 32 "Sprites" and 15-bit direct colour
Inputs, both "controllers" and keyboards
Random number generator
Text outputs
Stack
Von Neumann Architecture
Assembler (written in an external program)
Save memory
Four pre-written carts to play with
Previous Updates:
Fixed code controlled graphics updates
Made Bootloader clear TTY, Keyboard, and Controller Pushed
Fixed Register ALU instructions
Updated Cart A and Cart B to make use of the Register ALU instructions
Updated the Cart B to respond to the start button on both controllers
Moved to new project to fix issues around searching for projects branched from private projects
Removed unnecessary EEPROM banks from all carts
Removed unnecessary write lines leading to EEPROMs in carts, preventing code from being overwritten during execution
Made Reset clear WRAM and the General Registers
Fixed Keyboard
Added a Bundle Cart that allows you to view all the carts I have made without changing carts (you must reset the console to view another cart)
Fixed bug in standard bank design which wrote data to incorrect addresses
Fixed contention issue with multiplying
Added Annotations to the In Debug
Added Snake Player
Added Reset and Power Labels to the relevant buttons
Further optimisation to reduce lag/increase execution speed
Added more memory access options
Further optimisation of the CU
Added a keyboard to controller mapping
Updates:
Continued optimisation and overhaul of the CU
Removed old CU subcircuits
Removed old compare subcircuit
Added additional stack access instructions
Updated the debug versions with the changes, as well as fixing bugs in the debug versions
Designed a Logo for the Femto-4
Added an additional sprite for the logo
Rewrote the pixel art cart to allow the sprites to be viewed out of order
Future Updates:
More pre-written carts
Bug fixes
Compiler
Do fork the project and write your own code for it! If you want more information on how to do so read the Developer Guide in the assembler.
Note:
The Flappy Bird high score and the Snake high score are mine. If you want to save your own scores permanently you will have to fork the project.
The Femto-4
General Architecture:
The Femto-4 is a 16-bit computer with variable length instructions that are comprised of multiple 16-bit words. First the OP Code of the instruction is read, and then depending on the OP Code, additional pieces of data may be read for the operands. This allows execution to become incorrectly offset, which can lead to the execution of garbage if the PC is jumped to an incorrect address. This is usually fine, since the OP Code space is so empty that the data will likely be passed one at a time until the next valid instruction. Data is read through the standard data retrieval system (which is handy since its design is so universal and easy to add to) making this architecture a Von Neumann architecture as opposed to a Harvard architecture, like my previous, worse, computer. The MAR always specifies the address being read to or written from, whilst the MDR always holds the data being written. Data from the data out bus can be written to any special register during the instruction. OP Codes and operands are all 16-bits, which is a bit wasteful in terms of OP Code usage, however it was easier to implement this way, and so that is what I went with (and there are a lot of ALU processes).
Memory Mapping:
The 16-bit address space of the Femto-4 is memory mapped, with all data being stored somewhere in the address space. The last 48kx16b of memory (all addresses starting with 01, 10, or 11) are dedicated to the cart memory. This is where the interchangeable program would be stored, allowing programs to be easily changed by changing carts. The carts have 32 16kx16b EEPROM/RAM chips, which can be switched between during execution by writing to address 00cc. This gives each cart 512kx16b of memory to play with. In theory, additional memory can be added in a cart by creating a similar system on the inside of the cart, which would allow it to swap between even more EEPROM/RAM chips. The initial 16kx16b are therefore mapped to everything else, including a fixed "work" RAM chip that cannot be switched out, the bootloader, the PPU data, general use registers, the, stack, inputs, outputs, and special use registers.
"Fast Execution":
Execution at the fastest clock speed (one pulse every 100ms, or 10Hz, which is defined as the clock changing state every 50ms, or at a rate of 20Hz) is terribly slow, and would make reasonable graphics effectively impossible. Due to this, the Femto-4 includes several execution modes that allow the computer to run much faster. There are two registers involved in this, address 00ca, the mode register, and address 00cb, the protection register. When the two least significant bits of the mode register are low, the computer runs normally, executing 1 instruction per clock pulse. When it is set high however, the computer enters fast execution on the rising edge, where it executes multiple instructions per clock pulse. This is achieved by looping a rising edge monostable circuit into a falling edge monostable circuit, producing a loop that will pulse indefinitely until the looping line is written high to by some external factor. Stopping the loop is critical since leaving the loop running will stop CircuitVerse's execution, due to it going over the stack limit of the execution. "Fast execution" is always paused by a 0x0000 OP Code, which ensures that the computer will not attempt to "fast execute" memory that has not been written to. It is also paused by the OP Code 0x0001. Setting the 3 bit of the mode register high will enable protection. This will ensure that computer only executes as many instructions as the value in the protection register. This protects execution by ensuring that the loop will always pause before the cycle limit is reached. Since some operations are far more complex than other operations, the maximum number of instructions per clock pulse is variable, and testing should always be conducted to ensure that the limit is not reached. Due to this, for games that need regular graphics updates, it is recommended that protection is not used, and instead the pauses are fully code controlled. Setting the 2nd bit of the mode register high will enable the clock to run fast execution on the falling edge of the clock as well, doubling execution speed. On the other end of the mode register are the graphics mode. The highest two bits give the graphics update mode, 00 for falling edge only (normal speed), 01 for dual edge (double speed), 10 for every other clock pulse (half speed), and 11 for code controlled, where the 0x0001 OP Code is required to update the graphics. The third most significant bit is the graphics disable bit. Setting it high stops updating the graphics, reducing lag by reducing the number of changing outputs. The mode and protection values are only updated on the rising edge of the clock pulse, and therefore there should always be pauses before and after any execution mode or protection change.
Graphics:
The Femto-4 is capable of driving a 16x16 15bit direct colour screen. It has space for 32 "sprites" which are rectangles with an assigned colour. All the sprites are drawn to the screen whenever a graphics update occurs, depending on the graphics mode. When using dual-edge "Faster Execution", the falling edge should only be used to execute game code, since writing graphics data as the screen is being drawn may mess up the graphics. These 32 "sprites" have their data stored in the PPU RAM in the following format: The first 16 bits are the corners of the rectangle, with each coordinate being 4 bits. The coordinates are ordered x1 x2 y1 y2. The next 16 bits are the sprites colour, with the first 15 bits being used for 15 bit direct colour, and the last bit being used to enable or disable drawing the sprite. Since the screen is not wiped every time it is refreshed, the background must be a sprite to ensure that the screen is fully wiped before the rest of the sprites are drawn on. Control of this allows carts to draw a single frame over multiple updates, allowing the 32-sprite limit to be bypassed (see how Snake works). The "sprites" are drawn in memory order, with the "sprite" with the largest address always being drawn last and therefore on top, of all other "sprites". This is achieved by using the exact same monostable clock system as "Fast Execution", which reads off all the sprite data and draws them to the screen in a single clock pulse. This can loop more times safely than the main CPU since it has less dependencies which dramatically decreases the simulation's stack usage.
ALU:
The basic ALU was inspired by the ALU-74LS181. It was designed to flexibly change between various operations by changing an additional piece of data which is bundled in the OP Code. This allows a single ALU to handle all the required processes, such as the basic binary logic operations, shift left, adding, and subtracting. This is unlike my previous computer which had different chips for each operation it could do. The Femto-4 also can multiply, divide, shift right, shift left/right by a specified number of bits, and perform operations designed to work with the Femto-4's graphics data.
Conditional Jumps:
The Femto-4 can perform immediate and direct jumps depending on the flags, a specified bit of the accumulator, and the clock. The flag jumps allow for comparisons to be made. There are three flags, the carry, the most significant bit in the accumulator, and if the accumulator value is 0, the equals flag. By performing A-B, we can compare A and B by looking at the flags. If the equals flag is true, then A=B, since A-B = 0. If the most significant bit is 0, then the number is positive or 0 (by two's complement) and therefore A>=B. The comparison is not entirely correct for numbers in two's complement (a large positive number and a large negative number when subtracted can yield a positive number), but for small values it works well. Whilst we cannot directly check A<=B using A-B in this design, we can simply flip the subtraction to B-A to do so.
The accumulator bit testing is mainly used to check for controller inputs. Since each button in the controller is mapped to one bit, bit testing that bit effectively allows us to check if a button has been pressed. In theory a similar test could be performed using an AND instruction, and checking if the result is equal to 0 or not.
The jump on clock is there to ensure that we can jump execution on the right clock pulse, which ensures that graphics can be updated on the edge of execution.
Timing:
This computer is timed using several standard delay chips. The pulse length running in to the computer is about 10k units long. Therefore, different parts an instruction are separated by 20k unit delays. Further control of timings inside these periods is achieved through 1k "On Delays", which have a 1k delay turning on, but a 0k delay turning off, ensuring that pulses do not bleed into the next pulse. These pulses can tell registers to write and what source to write from, enable the read and write lines, update the ALU, and update the stack. For more information on how delay works see here: https://circuitverse.org/users/4699/projects/circuitverse-delay-introduction.
Keyboard Mapping:
The Femto-4's keyboard controller mapping was created using a specialised chip. This chip used the fast execution loop to take 15 inputs from a keyboard and map the inputs to button presses on the controllers. Since the buttons are updated several times in a clock pulse, the keyboard controller cannot handle held buttons particularly well. After 15 inputs are read, the buffer is cleared to prevent future input lag. The keyboard mapping is designed to work with both controllers, allowing two player games to be feasible on the computer.
Other Notes:
The memory wrappers allow external chips to interact with the main data control system, in this case used for RNG, controllers, the keyboard, and driving the text output. This makes it easy to additional chips to the computer.
For more information, please read the developer guide found in the Femto-4's Assembler, or just post a comment and ask me.
This is a secret to everybody, unless you found it.
Created: Aug 05, 2021
Updated: Aug 26, 2023
Comments