WiiSX/CubeSX Progress & Beta 1

As you may recall, WiiSX was put aside so that we could all focus on Wii64. However, I picked it up again over the past weekend and made a few improvements, with a bit of help from tehpola. I ported over the pcsx-df code, an updated and less buggy pcsx codebase, which instantly gave us great compatibility results in the Interpreter.

However, I then got the PowerPC recompiler (originally from here, coded by Gil Pedersen) working. It didn’t take much to get it into an operational state: making it leave GPR r13 alone to comply with the EABI standard got it working, along with some other small tweaks here and there. I also added some more functionality to it (all loads are now recompiled) and gave it 8MB of memory to keep recompiled code in (only 7MB on GC), and this will gradually get bigger as I move things to MEM2/ARAM. There’s quite a bit to be improved within the dynarec, and quite a bit still to be coded, but I’m quite happy with its performance all things considered.

The GUI is non-existent apart from two options and a text-based file browser. You can choose between the Dynarec or Interpreter and a Standard Controller or an Analog Controller. The analog controller is broken in some games, and so is the Dynarec (Final Fantasy VII battles).

Of course there is quite a bit to do, such as XA audio, CDDA audio, a GUI, save states, .iso support, and much more, but this project will go on the sidelines again until the Wii64 beta is released. We hope to see some positive contributions to the code in the meantime while we work on Wii64, and not just controller config hacks :)

The full source (now updated with my latest changes) is available at http://code.google.com/p/pcsxgc/.

You may download CubeSX and WiiSX beta files from the pcsxgc GoogleCode site.

Instructions are provided in the readme.txt file in each respective archive.

Progress Report: Wii64 Dynarec (Part 1)

In the past few months, we’ve made significant progress on the Wii64 dynarec.  Most of the bug fixes are pretty minor, like correcting off-by-one or other memory errors; however, there are several substantial changes to both the infrastructure and features of the dynarec.

On the N64, there is a register called Count which keeps track of how many cycles the system has been running.  This is primarily used to determine when interrupts can be taken.  In Mupen64, Count is estimated as 2 cycles per instruction executed.  Some emulators actually increment Count differently depending on which instruction ran (because on the hardware, some instructions will take longer to execute).  The fact that Mupen was doing really well with the Count estimate led me to believe that getting an exact Count was unnecessary, and I initially tried playing some tricks to estimate without explicitly keeping track of Count.  However, I quickly discovered that even deviating from the way Mupen counts will quickly result in crashes and freezes.  Several major fixes have involved correcting edge cases which caused Count to be somewhat off.
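To make the Count bookkeeping concrete, here is a minimal sketch in C of the flat 2-cycles-per-instruction estimate and an interrupt check driven by it. This is not the Mupen64 or Wii64 code; the struct, its next_interrupt field, and the function names are assumptions made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The flat estimate described above: every instruction costs 2 cycles. */
#define CYCLES_PER_INSTRUCTION 2u

typedef struct {
    uint32_t count;          /* emulated Count register */
    uint32_t next_interrupt; /* hypothetical: Count value at which the next interrupt is due */
} timing_t;

/* Advance Count after executing `instructions` emulated instructions. */
static void advance_count(timing_t *t, uint32_t instructions)
{
    assert(t != NULL);
    t->count += instructions * CYCLES_PER_INSTRUCTION;
}

/* Returns nonzero once Count has reached or passed next_interrupt.
   The signed difference keeps the comparison correct when Count wraps. */
static int interrupt_due(const timing_t *t)
{
    assert(t != NULL);
    return (int32_t)(t->count - t->next_interrupt) >= 0;
}
```

Because the interrupt check depends directly on Count, a block that adds even one instruction too many or too few shifts the point at which an interrupt is taken, which fits the sensitivity described above.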

Initially only 32-bit integer instructions were supported in the dynarec (they comprise most of the ISA, and I just wanted to get something working before I tried anything too complicated).  Once I got the dynarec running with just those basic instructions, it was still fairly slow because a lot of instructions were still being interpreted (thus negating any performance benefits of the dynarec).  Getting the floating-point and 64-bit instructions (which aren’t used as often as the name N64 would lead you to believe) supported in the dynarec was important for improving the dynarec performance beyond that of the pure interpreter.

With the exception of the way floating-point comparisons and conversions are done in MIPS vs PPC and MIPS’s sqrt, floating-point was fairly straightforward to implement in the dynarec as most instructions had a 1-1 mapping.  Even the comparisons were relatively simple, although they do not take advantage of what I feel is a richer FP comparison on the PPC.  However, since the Wii does not have a floating-point square root instruction, it was difficult to support the MIPS sqrt instruction in only a few instructions.  We did manage to get it working with what seems to be good-enough precision using the PPC frsqrte (floating reciprocal sqrt estimate), Newton-Raphson refinement, and an fmul.  The only floating-point instructions left to support are conversions to and from 64-bit integers, which are nearly impossible to generate code for because there is no hardware support on the Wii and the process is rather complex.
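The square-root approach can be sketched in portable C as follows. This is only an illustration of the idea (reciprocal square root estimate, Newton-Raphson refinement, then a multiply by x), not the code the dynarec emits. Since frsqrte is a PPC instruction, the sketch substitutes a well-known integer bit trick for the initial estimate; that stand-in, the function names, and the choice of three refinement steps are all assumptions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the PPC frsqrte instruction: a crude estimate of 1/sqrt(x)
   using the well-known integer bit trick. This is NOT what the hardware
   does; it is only here so the sketch runs on any machine. */
static float rsqrt_estimate(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    bits = 0x5f3759dfu - (bits >> 1);
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* sqrt(x) computed as x * (1/sqrt(x)), with the reciprocal refined by
   Newton-Raphson: y' = y * (1.5 - 0.5 * x * y * y). */
static float sqrt_via_rsqrt(float x)
{
    assert(x >= 0.0f);
    if (x == 0.0f)
        return 0.0f; /* a real reciprocal estimate of 0 is infinite, and 0 * inf is NaN */

    float y = rsqrt_estimate(x);
    for (int i = 0; i < 3; i++) /* iteration count is an assumption */
        y = y * (1.5f - 0.5f * x * y * y);
    return x * y; /* the final multiply (fmul on PPC) */
}
```

Each Newton-Raphson step roughly squares the relative error of the estimate, so a handful of multiplies and subtracts is enough to go from a rough estimate to near single-precision accuracy.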

64-bit instructions were a similar story: most of the instructions had a straightforward translation from MIPS to PPC (even though the PPC in the Wii is 32-bit), but there were a few which were difficult to emulate.  The addition, subtraction, and logical instructions were very simple: you use two PPC registers to store a 64-bit value, and there are instructions which will keep track of and use the carry bit so that a 64-bit add/sub can be performed in two 32-bit add/sub instructions.  The 64-bit shifts were relatively complicated because you have to shift both 32-bit words separately, then determine what would have spilled from one into the other and OR it into that word, but it can be done in around 10 PPC instructions.  As with FP, there were a few 64-bit instructions that we couldn’t reasonably generate code for: the 64-bit multiply and divide are too complicated to generate code for using only 32-bit operations.
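The two-register technique looks roughly like this when written out in C. It is a sketch of the arithmetic only, with a made-up reg64_t type and function names; the dynarec emits PPC instructions that do the equivalent work (a carry-setting add followed by an add that consumes the carry) rather than calling C functions.

```c
#include <assert.h>
#include <stdint.h>

/* A 64-bit value held in two 32-bit halves, as it would be in two PPC registers. */
typedef struct { uint32_t hi, lo; } reg64_t;

/* 64-bit add as two 32-bit adds: the low add produces a carry,
   and the high add consumes it. */
static reg64_t add64(reg64_t a, reg64_t b)
{
    reg64_t r;
    r.lo = a.lo + b.lo;
    uint32_t carry = (r.lo < a.lo); /* unsigned wrap-around means a carry out */
    r.hi = a.hi + b.hi + carry;
    return r;
}

/* 64-bit left shift: shift both words, then OR the bits that spill
   out of the low word into the high word. */
static reg64_t shl64(reg64_t a, unsigned sh)
{
    assert(sh < 64);
    reg64_t r = a;
    if (sh >= 32) {
        r.hi = a.lo << (sh - 32); /* everything in the high word came from the low word */
        r.lo = 0;
    } else if (sh > 0) {
        r.hi = (a.hi << sh) | (a.lo >> (32 - sh));
        r.lo = a.lo << sh;
    }
    return r;
}
```

The special cases for shift amounts of 0 and of 32 or more are part of why the shifts cost more instructions than the adds.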

However, even with most of the ISA implemented, there was still significant room for improvement in performance.  I have since made some other significant improvements which I will be detailing in more posts to come soon.

No news is good news?

We don’t really appreciate the negative comments on the site. It is obvious that there is a lot of anticipation for homebrew N64 emulation on the Wii/GC. Believe us when we say that we are equally excited to see Wii64 enter public beta stage. We have devoted a lot of our free time to this emulator for nearly two years, going back to the days before Wii homebrew was even possible. Many of you who are complaining are simply being impatient because you want to try something that you would only complain about more until we have worked more bugs out.

We have entered a private alpha stage, and we have made many improvements/bugfixes to the emulator in the last few weeks. However, there is still a substantial amount of work to do, and a couple of us are very busy with schoolwork right now. We won’t give any release dates, but we hope to be able to release a public beta as soon as possible. Please don’t ask for a release date or to beta test.

April 1st Tiizer is Real & General Update

First off, the April 1st Tiizer video is actual gameplay using a recent dev build of Wii64. As you can tell, tehpola has made tremendous progress in debugging and optimizing the dynamic recompiling core. However, there are still a handful of showstopping bugs that we need to work through before we can make a public release. Also, you should be aware that not all of your favorite games will run on the initial release, for a variety of reasons. We are not planning on initially supporting the Expansion Pak because of memory limitations. After further optimizations, tweaks, and profiling to reduce our memory consumption, we hope to add Expansion Pak support. We may not initially support games that execute code directly from the cart or that use virtual memory (e.g. Goldeneye) because this requires more investigation and significant code changes in the dynarec to implement. Also, some graphics microcodes aren’t supported in glN64, so a few games such as Conker’s BFD won’t work just yet. But sit tight, and we’ll continue to work on more features for Wii64 after the initial release.

A complete re-code of the Wii64 gui is underway, so you’ll be able to enjoy using the wii-mote for navigation and also some sleek new graphics. We’ll have a new look for the initial release, but we also plan on adding more features to the gui over time for your enjoyment.

If you have watched any of the recent gameplay videos, then you know that the accuracy of the glN64 port has increased substantially since the Wii64 Tiizer release we made for the Homebrew Channel. Because GX is not 1:1 with OpenGL, there was a lot of investigation and tweaking required for me to get the behavior on GC/Wii close to what glN64 looks like on PC. There are still a variety of bugs for different games, so don’t expect everything to look perfect yet. Emu_kidid is a great tester, and he is maintaining an internal graphical issue list to work on. I hope to add a couple more features to glN64 prior to release, including glN64’s primitive framebuffer texture support as well as 2xSaI scaling for textures. The plan is, of course, to continue hunting down bugs and adding features after the upcoming release.

As for the other graphics plugins, glN64_GX is much faster than both soft_gfx and GX_gfx, so we may only release a build with glN64_GX. The only drawback is that currently glN64_GX won’t render graphics for demos that only directly manipulate the framebuffer with the CPU. However, when I have time I’ll add a feature into glN64_GX that will allow it to render the N64’s framebuffer rather than rendering primitives passed through the N64’s graphics pipeline. Then, you can just flip an option in the menu when you are running homebrew N64 games and demos that write directly to the framebuffer. Also, I have already done some work on porting Rice’s video plugin to Wii64. Rice supports more microcodes than glN64, including the one that Conker’s BFD uses, and it should be faster than glN64. We have a vision of supporting custom texture packs in Wii64, so we will implement that feature as well. We hope that you, our users, will contribute your creative talents in developing texture packs to share with the Wii64 community. We can’t say when custom texture pack support will be finished, but expect it sometime in the future.

Some of you have been asking for an update on WiiSX. We are planning on working on a release of WiiSX after the upcoming Wii64 release. The reason we have not done a release yet is because there were some serious bugs in SVN last fall, and we also wanted to focus on completing Wii64. We have since resolved some WiiSX issues, internally, and so once Wii64 is out the door, we feel that we can also follow up with a WiiSX release relatively soon afterwards.

Finally, we’d again ask that, if you enjoy using Wii64 when it’s out, you consider donating to the project. Right now, most of the donations we receive go toward hosting costs. However, there are also some small accessories like component cables and classic controllers that we are considering purchasing with donation funds to aid in development.

Status Update: Future Releases

As it’s been a while since our last binary release, we wanted to clarify why it’s been so long and what we’re waiting on before our next release.  Early on in development we were making relatively big changes which significantly improved the emulator; however, we’ve gotten to the point where a lot of the big things have been done, and only need perfecting (with the exception of the dynarec).  Thus, we haven’t felt the need to make several binary releases, as users who aren’t interested in compiling the source themselves are mostly uninterested in the kinds of changes that have been made.  We do indeed have a milestone for our next release planned: a working, stable dynarec.  Most of the work that has gone into the emulator since our last release has been focused on the dynarec, and since we still don’t have a completely working dynarec, there haven’t been many noticeable changes.  So we’re holding out for a dynarec which supports at least most games without crashing before we make our next release.  After getting it running initially, there will likely be more room for optimization if there are still any performance issues.  In that case, we will likely have frequent releases once again as there will be noticeable improvements with each optimization that is made.  As always, please be patient.  We’re working hard to make the next release something worth the wait.

On an unrelated technical note, we have managed to free up 1.75MB in RAM by consolidating the various memory LUTs (look-up tables) into a single LUT for all memory operations.  In Mupen64, there are 8 different memory LUTs which are used to determine how to handle memory accesses at different addresses.  These 8 are split up by read/write byte/half-word/word/double.  Instead of having 8 large LUTs, I created one LUT for all memory operations which points to smaller LUTs which handle the different memory operations in the specified segment.  Memory operations only require an additional load for the second level LUT so there is no performance impact by this change.  We are still looking into other ways to further reduce our memory usage to make sure that we have plenty of room in memory for recompiled code produced by the dynarec.
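A rough sketch of the two-level layout, in C, may help. It is not the actual Wii64 data structure: the type names, the operation enum, the toy RDRAM handlers, and the choice of indexing the top-level table by the upper 16 bits of the address are all assumptions. That said, if each of the original 8 tables had 65,536 four-byte entries (2MB in total), replacing them with a single such table (256KB) would account for the 1.75MB figure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The 8 kinds of access the original 8 LUTs were split by. */
enum {
    OP_READ_BYTE, OP_READ_HALF, OP_READ_WORD, OP_READ_DOUBLE,
    OP_WRITE_BYTE, OP_WRITE_HALF, OP_WRITE_WORD, OP_WRITE_DOUBLE,
    OP_COUNT
};

typedef void (*mem_handler_t)(uint32_t address, uint64_t *value);

/* Second level: one small table of handlers per kind of memory segment. */
typedef struct { mem_handler_t op[OP_COUNT]; } segment_ops_t;

/* First level: one entry per 64KB of address space (assumed granularity). */
static const segment_ops_t *segment_lut[0x10000];

/* Two loads: first-level entry, then the handler for this operation. */
static mem_handler_t lookup_handler(uint32_t address, int op)
{
    assert(op >= 0 && op < OP_COUNT);
    const segment_ops_t *seg = segment_lut[address >> 16];
    return seg ? seg->op[op] : NULL;
}

/* Toy 64KB big-endian backing store with word handlers, for illustration only. */
static uint8_t rdram[0x10000];

static void rdram_read_word(uint32_t address, uint64_t *value)
{
    const uint8_t *p = &rdram[address & 0xFFFCu];
    *value = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
             ((uint32_t)p[2] << 8) | p[3];
}

static void rdram_write_word(uint32_t address, uint64_t *value)
{
    uint8_t *p = &rdram[address & 0xFFFCu];
    uint32_t v = (uint32_t)*value;
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

static const segment_ops_t rdram_ops = {
    .op = { [OP_READ_WORD] = rdram_read_word, [OP_WRITE_WORD] = rdram_write_word }
};
```

Mapping a segment is then a single assignment such as segment_lut[0x8000] = &rdram_ops, and many first-level entries can share the same small second-level table, which is where the memory saving comes from.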

Status update: 4th Jan ’09

Just a tiny status update about the progress within the last few weeks to bring us into 2009.

I finally realized what was going wrong with a couple of games: we weren’t using 16Kbit EEPROM type saves for certain games that require them. A fix should make its way to the SVN in the next few days for this. Notable games that are now booting thanks to this fix are: Banjo Tooie, ExciteBike 64, Yoshi’s Story.

As for some news on the dynarec front, I coded up a few rough demos, and these revealed a bug or two to us. Also, tehpola has been working hard at debugging the emulator on his PS3 to find and correct the bugs we’re running into (see the article below for why it’s done on PS3). Every little bit gets us one step closer to the finished recompiler.

Also, the soft graphics plugin has been updated to render the framebuffer through hardware to aid us in debugging (with a faster framerate).

Season’s Greetings.

The State of: Wii64 Dynarec

Since this is my first post on the blog discussing the dynarec, I’d like to first explain what a dynarec is and why we’re going to need one to accomplish full speed emulation on the Wii.  Then I’d like to describe the history of the dynarec in our emulator, where it’s at now, and what needs to be done to get it working.

First of all, dynarec stands for dynamic recompiler, which is actually a bit of a misnomer in the console emulation world: usually it’s not accomplished by creating an abstract syntax tree or control flow graph from the emulated machine code and running a target machine code compiler over it, which is what recompilation would really entail.  The proper term would be binary translation: for each emulated instruction, I convert it to an equivalent target instruction.  Since the N64 is a MIPS architecture machine, I take a MIPS instruction, decode it (determine what kind of instruction it is and what operands it operates on), and then generate equivalent PowerPC (GC/Wii use PPC) instructions to perform the operation that the MIPS instruction intends to.  What we try to do is take a block of code that needs to be run, and fill out a new block with PowerPC code that was created by converting each of the MIPS instructions in the block.  The emulator then runs the block of code as a function: it will return when a different section of code needs to run, and the process repeats for the next block of code.
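As a concrete illustration of translating a single instruction, the sketch below decodes a MIPS ADDU and emits the matching PowerPC add. The instruction encodings are the standard ones, but everything else is an assumption for the sake of the example: the function names, the map[] array saying which PPC register holds each MIPS register, and the idea that map[0] names a PPC register kept at zero. It is not taken from the Wii64 source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decoded fields of a MIPS R-type instruction. */
typedef struct { unsigned opcode, rs, rt, rd, funct; } mips_rtype_t;

static mips_rtype_t decode_rtype(uint32_t insn)
{
    mips_rtype_t d;
    d.opcode = insn >> 26;
    d.rs     = (insn >> 21) & 31;
    d.rt     = (insn >> 16) & 31;
    d.rd     = (insn >> 11) & 31;
    d.funct  = insn & 63;
    return d;
}

/* Encode PowerPC "add rD, rA, rB" (primary opcode 31, extended opcode 266). */
static uint32_t ppc_add(unsigned rd, unsigned ra, unsigned rb)
{
    return (31u << 26) | (rd << 21) | (ra << 16) | (rb << 11) | (266u << 1);
}

/* Translate one MIPS instruction into PPC words written to out[].
   map[] is a hypothetical table: which PPC register holds each MIPS register.
   Returns the number of PPC words emitted, or -1 if the instruction is not
   handled by this sketch (a real translator would fall back to the interpreter). */
static int translate(uint32_t mips, const unsigned char map[32], uint32_t *out)
{
    assert(map != NULL && out != NULL);
    mips_rtype_t d = decode_rtype(mips);

    if (d.opcode == 0 && d.funct == 0x21) { /* ADDU rd, rs, rt */
        if (d.rd == 0)
            return 0; /* writes to MIPS $zero are discarded: emit nothing */
        out[0] = ppc_add(map[d.rd], map[d.rs], map[d.rt]);
        return 1;
    }
    return -1;
}
```

A block translator would call something like this in a loop over the MIPS instructions of a block, appending the emitted words to a buffer that is later called as a function.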

What we’re doing now is running an interpreter: instead of translating the MIPS code we want to run, we just decode each instruction and run a function written in C which performs what the MIPS instruction would do.  This may seem like less work, since we don’t have to translate all the code and then run it; we just run it.  However, because the code is run so many times and running the translated code is much faster than running each instruction through the interpreter, the extra time spent translating is made up for by the faster time running through long loops.
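For comparison, an interpreter of the kind described here can be sketched in a few lines of C: decode the opcode, look up a C function for it, and call it. This toy version handles only two MIPS instructions (ADDIU and ORI), and its cpu_t type, dispatch table, and function names are assumptions for illustration rather than Mupen64's actual structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uint32_t gpr[32]; } cpu_t;
typedef void (*op_fn)(cpu_t *cpu, uint32_t insn);

/* ADDIU rt, rs, imm: add a sign-extended 16-bit immediate. */
static void op_addiu(cpu_t *cpu, uint32_t insn)
{
    unsigned rs = (insn >> 21) & 31, rt = (insn >> 16) & 31;
    int32_t imm = (int16_t)(insn & 0xFFFFu);
    cpu->gpr[rt] = cpu->gpr[rs] + (uint32_t)imm;
}

/* ORI rt, rs, imm: OR with a zero-extended 16-bit immediate. */
static void op_ori(cpu_t *cpu, uint32_t insn)
{
    unsigned rs = (insn >> 21) & 31, rt = (insn >> 16) & 31;
    cpu->gpr[rt] = cpu->gpr[rs] | (insn & 0xFFFFu);
}

/* One C function per primary opcode; unhandled opcodes stay NULL. */
static const op_fn dispatch[64] = { [9] = op_addiu, [13] = op_ori };

/* Decode and execute up to n instructions; returns how many were run. */
static size_t interpret(cpu_t *cpu, const uint32_t *code, size_t n)
{
    assert(cpu != NULL && code != NULL);
    size_t i;
    for (i = 0; i < n; i++) {
        op_fn fn = dispatch[code[i] >> 26];
        if (fn == NULL)
            break;           /* opcode not handled by this sketch */
        fn(cpu, code[i]);
        cpu->gpr[0] = 0;     /* MIPS $zero is hardwired to 0 */
    }
    return i;
}
```

Every pass through a loop pays for the decode, the table lookup, and the indirect call again, which is the overhead that translating the block once avoids.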

The dynarec was the first thing I started working on with the emulator: it seemed like the most interesting aspect and the most crucial for such a port (besides the graphics, which I didn’t understand well enough at the time to do much useful work besides porting a software renderer).  It’s gone through a few different stages: a 1-to-1 register mapping binary translator, a quickly dropped attempt at reworking the translator to be object oriented, a slightly further progressed attempt at a MIPS to Scheme translator, and where I’m currently at: the first binary translator without 1-1 register mapping, conforming to the EABI (Embedded Application Binary Interface).

I was concerned about performance initially, and I got a little greedy: I decided that since both MIPS and PowerPC had 32 general purpose registers, and MIPS has one hardwired to 0, and PowerPC has an extra register (ctr) I could move values into for temporary storage, I could do a simple translation of most of the instructions by using all the same registers as the MIPS would use on the PPC.  The idea was that I wouldn’t have to shuffle things in and out of registers; I would load the whole MIPS register set into the PPC registers, run the recompiled code which would operate on those values, and then, when it’s done with a block, store those values back and restore the emulator’s registers.  This was a bad idea for several reasons: small blocks that only fiddled with one or two registers still had every single register stored, loaded, and then stored and loaded again for each block; I had to disable interrupts because I destroyed the stack and environment pointers that were expected if any interrupts were taken; and because I couldn’t take interrupts, it was very difficult to debug because I couldn’t run gdb in the recompiled code.  I had developed a pretty large code base and a somewhat working recompiler before I truly came to realize all the drawbacks of the method: it ran some simple hand-crafted demos I had written in MIPS which computed factorial and a few other simple things, but overall it was too unwieldy and inefficient to continue to debug.

My attempt at refactoring the code I had written in an OOP way was soon abandoned, but it did inspire some improvement to the way I generated instructions.  Instead of piecing together the machine code from all the different parts, I wrote new macros which would do that for me for specific instructions, thus reducing some major code clutter in the translator functions.
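Instruction-generating macros of this kind might look something like the following. The PowerPC encodings are the standard ones, but the macro names, the codebuf_t type, and the emit helper are assumptions for illustration and are not the macros from the Wii64 source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Each macro assembles one PowerPC instruction word from its operands,
   so translator functions do not have to shift and OR fields by hand. */
#define PPC_OP(op) ((uint32_t)(op) << 26)

/* addi rD, rA, SIMM  (primary opcode 14) */
#define GEN_ADDI(rd, ra, imm) \
    (PPC_OP(14) | ((uint32_t)(rd) << 21) | ((uint32_t)(ra) << 16) | ((uint32_t)(imm) & 0xFFFFu))

/* ori rA, rS, UIMM  (primary opcode 24; note rS sits in the upper register field) */
#define GEN_ORI(ra, rs, imm) \
    (PPC_OP(24) | ((uint32_t)(rs) << 21) | ((uint32_t)(ra) << 16) | ((uint32_t)(imm) & 0xFFFFu))

/* lwz rD, disp(rA)  (primary opcode 32) */
#define GEN_LWZ(rd, disp, ra) \
    (PPC_OP(32) | ((uint32_t)(rd) << 21) | ((uint32_t)(ra) << 16) | ((uint32_t)(disp) & 0xFFFFu))

/* stw rS, disp(rA)  (primary opcode 36) */
#define GEN_STW(rs, disp, ra) \
    (PPC_OP(36) | ((uint32_t)(rs) << 21) | ((uint32_t)(ra) << 16) | ((uint32_t)(disp) & 0xFFFFu))

/* Hypothetical output buffer for a block of generated code. */
typedef struct {
    uint32_t *code;
    size_t len, cap;
} codebuf_t;

static void emit(codebuf_t *b, uint32_t insn)
{
    assert(b != NULL && b->len < b->cap);
    b->code[b->len++] = insn;
}
```

With these in place, a translator function reads more like assembly, for example emit(&buf, GEN_LWZ(3, 8, 1)) for lwz r3, 8(r1).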

I was unimpressed by the improvements I predicted I would see by refactoring the code in C++, and was inspired by Daeken’s work on IronBabel to start the dynarec from scratch using a high-level language.  The idea and the code were much simpler: decode the instructions using high-level magic and, instead of generating low-level machine code, generate high-level code to execute each instruction, collect all the code together, and run it as a function for each basic block.  I chose Scheme because of how easy it is to generate and run code on the fly (since in Lisp, code and data are only differentiated by how they’re used).  The recompiler was a breeze to write, but interfacing with the C code proved troublesome.  Although I eventually got the code to run, I ran into issues with the unlimited precision numbers in MzScheme, and my other choice of Scheme, Tiny Scheme, didn’t support some bitwise operations and I never got around to adding them.

Finally, I decided to go back to the old code base and improve on it with respect to the issues I had discovered along the way.  I wrote more macros to clean up the code generation, I did away with 1-1 register mappings, and worked on compliance with the EABI so that I wouldn’t have any issues with interrupts and calling the recompiled code as a C function.  Now, instead of loading and saving 31 registers for each dynarec block, I load each register as it’s used, and I store their values at the end of the block or if I have used up the allotted registers for storing MIPS registers (I use volatile registers so I don’t have to worry about saving their values).  It’s not much more complicated to translate the instructions with the new mappings because, for each block, the mappings are static and are kept track of while recompiling, so I simply build up a table of mappings while recompiling which I flush at the end of each block.  EABI compliance was a matter of creating a proper stack frame for the recompiled code, and not touching certain registers; since I have a few special values (base address of MIPS registers in memory, address of interpreter function, zero, and a running count of instructions) that I need to be maintained across any other calls, I needed to save those registers on the stack in the proper locations and restore them when I returned to the emulator.  EABI compliance allows me to leave interrupts enabled while the recompiled code is running (in general, leaving interrupts disabled for extended periods of time is a bad idea) and allows me to step through recompiled code in gdb, which greatly improves my ability to debug the dynarec.
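The load-on-use, flush-at-end mapping table can be sketched as below. This is a simplified model in C, not the Wii64 implementation: the regmap_t type, the four-slot limit, and the function names are assumptions, and instead of emitting real lwz/stw instructions it just counts how many loads and stores would be generated.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_SLOTS 4 /* hypothetical number of volatile PPC registers allotted */

typedef struct {
    int mips_reg[NUM_SLOTS]; /* MIPS register held by each slot, or -1 if free */
    int dirty[NUM_SLOTS];    /* nonzero if the slot must be stored back */
    int loads;               /* stands in for emitted loads from the MIPS register file */
    int stores;              /* stands in for emitted stores back to it */
} regmap_t;

static void regmap_init(regmap_t *m)
{
    assert(m != NULL);
    for (int i = 0; i < NUM_SLOTS; i++) {
        m->mips_reg[i] = -1;
        m->dirty[i] = 0;
    }
    m->loads = m->stores = 0;
}

/* End of block (or out of slots): store modified values and free every slot. */
static void regmap_flush(regmap_t *m)
{
    assert(m != NULL);
    for (int i = 0; i < NUM_SLOTS; i++) {
        if (m->mips_reg[i] >= 0 && m->dirty[i])
            m->stores++;
        m->mips_reg[i] = -1;
        m->dirty[i] = 0;
    }
}

/* Return the slot holding mips_reg, mapping it on first use.
   A value that is only going to be overwritten does not need to be loaded. */
static int regmap_get(regmap_t *m, int mips_reg, int for_write)
{
    assert(m != NULL && mips_reg > 0 && mips_reg < 32); /* $zero handled elsewhere */
    int slot = -1;
    for (int i = 0; i < NUM_SLOTS; i++)
        if (m->mips_reg[i] == mips_reg) { slot = i; break; }

    if (slot < 0) {
        for (int i = 0; i < NUM_SLOTS; i++)
            if (m->mips_reg[i] < 0) { slot = i; break; }
        if (slot < 0) {       /* allotted registers used up: flush and start over */
            regmap_flush(m);  /* note: this also invalidates slots handed out earlier */
            slot = 0;
        }
        m->mips_reg[slot] = mips_reg;
        if (!for_write)
            m->loads++;
    }
    if (for_write)
        m->dirty[slot] = 1;
    return slot;
}
```

A block that touches only one or two MIPS registers now costs one or two loads and stores instead of a full save and restore of the register set. One simplification to be aware of: flushing in the middle of an instruction would invalidate a slot already handed out for another operand of that same instruction, which a real translator has to guard against.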

The new format allowed me to debug things much more easily: I could compare the original code and the effects of the recompiled code by stepping through.  Soon after the reworked dynarec was completed, I pushed through all the obvious bugs in the apploader (I had some issues with the calculation of the checksum of the ROM and with invalidating recompiled code that was overwritten with new code).  Now the dynarec executes the standard apploader successfully, and begins running the code unique to each game.  However, I still haven’t seen much of anything happen after that point as far as graphics showing up, except in a demo I wrote that blits an image to the screen after running some unit tests.

As I’ve recently purchased a PS3 and installed Linux on it, I have a full environment to test the recompiler under without the hassle of running on the Wii.  I’ve already made a quick port of my dynarec to run under PPC Linux, and I believe it’s breaking at the same points it was on the Wii.  Running in a full OS gives me access to more tools such as valgrind and better support in gdb, which helps improve the rate at which I can narrow down and fix bugs, especially as they occur further into the execution of the games.

Barring some issues dealing with interrupts and exceptions, I believe the dynarec is feature-complete at this point, but there are some lingering bugs (possibly dealing with some instructions which I haven’t previously seen in action or some edge cases dealing with translation or execution) which need to be resolved in order to get the recompiler working.  There are a few kinds of instructions not yet recompiled which I intend to support after I have the basic integer instructions working: floating-point instructions, 64-bit instructions, and loads/stores from/to the stack, which will hopefully improve the performance of the dynarec once it’s running.  Of course, finding these bugs takes time, and it’s hard to put any kind of ETA on finding and fixing them because I don’t know how many issues are lurking behind the one I’m currently stuck on, and it’s not always easy to track down the source of an issue.  Please be patient as we work to resolve these issues; it’s hard to get this all right.  However, I believe that with the dynarec running and the hardware accelerated graphics we have now, we can accomplish smooth, full speed emulation of most titles, and possibly even support some extras like high-resolution textures.  As things progress, I hope to keep everyone informed of how things are going, so look for more posts on this topic later on.  In the meantime, emu_kidid has made a video demonstrating the emulator in its current state on the GC, so check it out.

New Site

Hey everyone, we’d like to welcome you to our new site.  The idea behind this site is to have a central location for information about our project, in hopes that we can reduce rampant speculation and avoid repeating our answers to frequently asked questions, etc.  The three of us, sepp256, emu_kidid, and I (tehpola), are often so busy with real life and doing actual work on our projects (mupen64gc and pcsxgc and more) that we don’t always have time to answer everyone’s questions.  We’ve created this site in hopes of improving our communication with the public: we’d like to be more open about what our goals are, what we’re working on, and how we’re doing things, and we would like the information collected mostly in a central location.

We’d like to begin by thanking the people who donated to help us get the site running:

  • Ninth Sage
  • iSynic
  • Jason
  • And many others (if you weren’t listed and would like to be, send us an email)

We’d also like to thank the great guys at OzModChips for donating a USBGecko to help us with our debugging needs.

We already have a page describing the goals for mupen64gc in great detail, and we’re in the process of developing an extensive compatibility list for it which will give information on how various titles perform.  So check out our site and let us know how we can improve it.