USM

From Amicitia Wiki
USM Format
Purpose: Pre-Rendered Video
Games: Persona 4 Golden, Persona 5, Catherine Full Body, Persona Q2, Persona 5 Royal, Yakuza 0, Yakuza Kiwami
Developer: CRI Middleware

USM is a video format commonly used on the PS3 and some other consoles. In Persona 5, USM files contain 2D anime cutscenes, pre-rendered 3D cutscenes, and various videos displayed on in-game objects such as TV screens. For a full list, see Movie (P5).

Tools

Scaleform Tool

To create a USM file, you need to use Scaleform Tool. It requires an AVI file as input, and the AVI must match the resolution of the USM found in the game you are modding, or else the video will glitch.

VGMToolbox

VGMToolbox is a free C# based tool designed for VGM (video game music) collectors and dumpers. However, it also contains tools for extracting and converting USM video to usable files. You can get VGMToolbox here.

To extract USM files in VGMToolbox, go to:

Misc. Tools > Stream Tools > Video Demultiplexer

And select "USM (CRI Movie 2)" in the Format box. Then drag and drop the USM files you wish to extract into VGMToolbox, and extraction will begin automatically. The final result will be two separate file types:

  • M2V (which only contains video).
  • ADX (which only contains audio).

In some cases, M2V files may appear pixelated, though this can be corrected with FFmpeg. To convert ADX to usable audio or to combine the video and audio files, you will need the other tools listed below.

ChipAmp (Winamp Plugin)

ChipAmp is a free plugin bundle for the Winamp media player that can play and convert audio files from various game consoles. It is needed to convert the ADX audio files contained within USM files to usable audio. You can get the ChipAmp plugin bundle here. If you do not already have Winamp installed, you can get that here.

With both Winamp and ChipAmp installed, open your ADX file in Winamp. Go to your Winamp preferences (Options > Preferences or Ctrl+P) and scroll down to Plug-Ins > Output. Select Nullsoft Disk Writer and click Configure at the bottom. Set up your export preferences as you wish (make sure to set Output File Type to "Force WAV file"), and click OK. When finished, click Play in Winamp and your ADX file will be converted to a WAV file.

FFmpeg

FFmpeg is a free suite of libraries and programs for handling video, audio, and other multimedia files and streams. It can depixelate M2V video from USM files, and can also recombine M2V video and WAV audio into a single MKV file. You can get it here.

To recombine M2V video and WAV audio, open a command-line window and change to FFmpeg's bin folder:

cd (ffmpeg_location)\bin

Then run the following command to merge the files:

ffmpeg.exe -i (video_file).m2v -i (audio_file).wav (output_filename).mkv

Hit Enter. When it's finished, your merged video will be in FFmpeg's bin folder.
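
The merge step above can also be scripted. The following is a minimal Python sketch, not part of FFmpeg itself: the function name build_merge_command and its parameters are hypothetical, and it assumes ffmpeg is reachable on your PATH. It only builds the argument list, writing the MKV next to the video file; passing stream_copy=True adds FFmpeg's -c copy option so the streams are not re-encoded.

```python
from pathlib import Path


def build_merge_command(video, audio, stream_copy=False, ffmpeg="ffmpeg"):
    """Return the ffmpeg argument list that merges video + audio into an MKV.

    The output name is derived from the video name (movie.m2v -> movie.mkv).
    """
    video = Path(video)
    output = video.with_suffix(".mkv")
    cmd = [ffmpeg, "-i", str(video), "-i", str(audio)]
    if stream_copy:
        # Keep the original streams as they are instead of re-encoding.
        cmd += ["-c", "copy"]
    cmd.append(str(output))
    return cmd


# Build (but do not run) the command for a hypothetical pair of files.
print(build_merge_command("movie.m2v", "movie.wav"))
```

To actually run the result, pass the list to subprocess.run(cmd, check=True).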

CRIDusmDemuxTool

CRIDusmDemuxTool is a command-line program that can extract USMs from Persona 5 The Royal. Audio extracts correctly, but the M2V video may appear corrupted. If that occurs, convert the M2V to another video format, preferably MP4. You can get it here.

Metadata Overview

Persona 4 Golden

Property | PS Vita (US) - MP4 | PC v1.0 [REV 2008] - WMV | PC v1.1 [REV 2033] - WMV ("Normal" Quality) | PC v1.1 [REV 2033] - WMV ("Low" Quality) | PC 64-bit [REV 4509] - USM
Video Codec | H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10 | wmv2 (Windows Media Video 8) | vc1 (SMPTE VC-1) | vc1 (SMPTE VC-1) | vp9
Frame Rate | 29.97 | 15.00 | 29.97 | 29.97 | 29.97
Width | 960 | 1920 | 1920 | 960 | 1920
Height | 544 | 1080 | 1080 | 540 | 1088
Display Aspect Ratio | 30:17 | 16:9 | 16:9 | 16:9 | 16:9
Pixel Format | yuv420p | yuv420p | yuv420p | yuv420p
Audio Codec | AAC (Advanced Audio Coding) | wmav2 (Windows Media Audio 2) | wmapro (Windows Media Audio 9 Professional) | wmapro (Windows Media Audio 9 Professional) | hca (Criware)
Audio Bit Rate | 127kbps | 128kbps | 128kbps | 128kbps | 192kbps
Audio Sample Rate | 48.000 kHz | 48.000 kHz | 48.000 kHz | 48.000 kHz | 48.000 kHz
Audio Channels | 2 (Stereo) | 2 (Stereo) | 2 (Stereo) | 2 (Stereo) | 2 (Stereo)
Stream Index | #0 = video, #1 = audio | #0 = video, #1 = audio (JP), #2 = audio (ENG) | #0 = audio (JP), #1 = audio (ENG), #2 = video | #0 = audio (JP), #1 = audio (ENG), #2 = video

Notes

  • PS Vita (US) - MP4: US MP4s must be decrypted before they can be played locally. JP USMs must also be demultiplexed.
  • PC v1.0 [REV 2008] - WMV: File extension for videos on PC are .USM but do not need to be demultiplexed. Simply rename the extension to .WMV to play locally.
  • PC v1.1 [REV 2033] - WMV ("Normal" Quality): File extension for videos on PC are .USM but do not need to be demultiplexed. Simply rename the extension to .WMV to play locally.
  • PC v1.1 [REV 2033] - WMV ("Low" Quality): Rename the extension to .WMV to play locally. Appear to be direct re-encodes of the Vita files as indicated by disclaimer text for p4ct001.
  • PC 64-bit [REV 4509] - USM: File extension for videos are .USM and need to be demultiplexed to play locally.

If necessary, you can collect a more detailed breakdown of metadata with tools like "Metadata2Go".

Modding Tips

Persona 4 Golden (PC)

Stream Index

Pay close attention to the changes between v1.0 and v1.1 in the metadata table above. The stream index of custom USM/WMV files must be configured appropriately to play correctly in-game. Even opening and end credits videos (e.g. p4ctop3, p4ct017) have two audio streams despite the tracks being identical. Be sure to follow this convention to avoid issues.

FFmpeg's documentation on the -map option is essential reading for anyone looking to rearrange streams so that a custom video plays correctly. You can find it here: https://trac.ffmpeg.org/wiki/Map

Practical Example

As a simple example, let's imagine we want to encode a 1080p MP4 with FFmpeg to work in-game.

  • First, identify the streams with ffprobe:

ffprobe -v error -show_entries stream=index,codec_name,codec_type "INPUT.mp4"

For this example, we can see that the streams of our source file are indexed as follows:

index 0 = aac (audio)

index 1 = h264 (video)

However, v1.1 (REV 2033) USM files should be formatted as follows:

index 0 = wmapro (japanese audio)

index 1 = wmapro (english audio)

index 2 = vc1 (video)

  • To encode our example MP4 to meet these specifications, we can use the following command:

ffmpeg -i "INPUT.mp4" -qscale 1 -vcodec wmv2 -acodec wmav2 -map 0:0 -map 0:0 -map 0:1 "OUTPUT.wmv"

As you can see, we've decided to use the wmv2 and wmav2 codecs in this case, instead of vc1/wmapro. In my experience, FFmpeg doesn't handle wmv3/vc1 well and wmv2 will work just fine in P4G so we'll use that. Once we've encoded the WMV, we just need to manually rename the extension to USM and it should work fine in-game.
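
The stream rearrangement in this example can be sketched in Python. This is an illustration under stated assumptions, not an official tool: the function names are hypothetical, the input is expected to be the JSON that ffprobe prints when -of json is added to the ffprobe command shown earlier, and the first audio stream is assumed to be the Japanese track and the second the English one. A source with a single audio stream has that stream mapped twice, as in the command above.

```python
import json


def parse_streams(ffprobe_json):
    """Return [(index, codec_type), ...] from ffprobe's JSON output."""
    data = json.loads(ffprobe_json)
    return [(s["index"], s["codec_type"]) for s in data["streams"]]


def map_args_for_v11(streams):
    """Build -map options in v1.1 order: audio (JP), audio (ENG), video."""
    audio = [i for i, kind in streams if kind == "audio"]
    video = [i for i, kind in streams if kind == "video"]
    if not audio or not video:
        raise ValueError("need at least one audio and one video stream")
    jp = audio[0]
    # Assumption: a lone audio track is duplicated to fill both slots.
    eng = audio[1] if len(audio) > 1 else audio[0]
    args = []
    for index in (jp, eng, video[0]):
        args += ["-map", f"0:{index}"]
    return args


# Illustrative ffprobe output matching the example source file above.
SAMPLE = json.dumps({"streams": [
    {"index": 0, "codec_name": "aac", "codec_type": "audio"},
    {"index": 1, "codec_name": "h264", "codec_type": "video"},
]})
print(map_args_for_v11(parse_streams(SAMPLE)))
# prints ['-map', '0:0', '-map', '0:0', '-map', '0:1']
```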

Resolution

Custom USM/WMV files should match the native resolution of the files you are replacing. Unless you're working with the "low quality" files introduced in v1.1, this will be 1920x1080. Custom files with a resolution greater or less than this will not scale correctly in-game.
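
A quick way to check a custom file before testing it in-game is to compare its resolution against the native sizes from the metadata table. The sketch below is only an illustration: the names NATIVE_SIZES, video_size and matches_native are hypothetical, and the input is assumed to be the JSON produced by running ffprobe -v error -show_entries stream=codec_type,width,height -of json on your file.

```python
import json

# Native sizes of the v1.1 files, taken from the metadata table above.
NATIVE_SIZES = {"normal": (1920, 1080), "low": (960, 540)}


def video_size(ffprobe_json):
    """Return (width, height) of the first video stream in ffprobe's JSON."""
    data = json.loads(ffprobe_json)
    for stream in data["streams"]:
        if stream.get("codec_type") == "video":
            return (stream["width"], stream["height"])
    raise ValueError("no video stream found")


def matches_native(ffprobe_json, quality="normal"):
    """True if the video matches the native size for the given quality."""
    return video_size(ffprobe_json) == NATIVE_SIZES[quality]


# Illustrative ffprobe output for a 1080p file.
SAMPLE_SIZE = json.dumps({"streams": [
    {"codec_type": "audio"},
    {"codec_type": "video", "width": 1920, "height": 1080},
]})
print(matches_native(SAMPLE_SIZE))          # prints True
print(matches_native(SAMPLE_SIZE, "low"))   # prints False
```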

Encryption

Persona 5 Royal

All USM files found in Persona 5 Royal are encrypted with a special key.
To decrypt them you need to use crid_mod by nyaga/bnnm:
crid_mod.exe -b [first half of decryption key in hex] -a [second half of decryption key in hex] -v -x -i [path to usm]

The decryption keys are region-specific. Steam has its own keys, which are the same as its ADX audio decryption keys.

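Region | Dec. (first half) | Dec. (second half) | HEX (first half) | HEX (second half)
JP | 0 | 10911089 | 00000000 | 00A67D71
CN | 0 | 29915170 | 00000000 | 01C87822
EFIGS | 0 | 56321924 | 00000000 | 035B6784
EFIGS (Steam) | 2310504 | 1026546598 | 00234168 | 3D2FDBA6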
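
The decimal and hex values in the table are the same numbers written in two bases, so a hex key can be derived from a decimal one. A minimal Python sketch follows; the function name key_halves_to_hex is hypothetical, and the zero-padding to 8 hex digits simply mirrors how the keys are written in the table.

```python
def key_halves_to_hex(dec_first, dec_second):
    """Convert the two decimal key halves to 8-digit uppercase hex strings."""
    for value in (dec_first, dec_second):
        if not 0 <= value <= 0xFFFFFFFF:
            raise ValueError(f"{value} does not fit in 32 bits")
    return f"{dec_first:08X}", f"{dec_second:08X}"


# EFIGS and EFIGS (Steam) rows from the table above.
print(key_halves_to_hex(0, 56321924))           # prints ('00000000', '035B6784')
print(key_halves_to_hex(2310504, 1026546598))   # prints ('00234168', '3D2FDBA6')
```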

Garbled Audio

ADX and WAV files extracted this way may have garbled, high-pitched, or otherwise distorted audio. This happens because the USM file contains multiple audio streams, and extracting them all at once layers them on top of each other. To fix this, extract each audio stream separately.

  • Use the correct hex key for your version of the game.
  • Dec. (decimal) keys are only used to decode audio from .awb/.acb and raw .adx files, not audio from .USM files, so they are not needed here.

To separate the streams, use the -s option followed by the audio stream chno ID:
crid_mod.exe -b [first half of hex key] -a [second half of hex key] -v -x -i -s [ID] [path to usm]

Audio streams are typically identified by the chno IDs -1, 0 and 1.

  • Some USM files use all three IDs (i.e. three audio streams); others use only one or two.

Practical example

You can identify which audio track you'd like to extract by using the -i command.

  • crid_mod.exe -b first half of hex key -a second half of hex key -i [path to usm]

This will generate an .ini info file about the USM which can be opened with Notepad.


In it you will find multiple headers. We are only concerned with the [CRIUSF_DIR_STREAM] headers.

Most important are the:

  • filename =
  • chno =

The filename will help you figure out which audio file is part of which stream.

  • i.e. filename = ep29_1.wav
  • This can be also used to identify and extract Japanese audio from the same USM i.e. filename = ep29_1_JP.wav.
  • If multiple languages occupy the same chno ID, they all use the same audio.
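
Reading the info file by hand gets tedious for USMs with many entries, so the lookup can be scripted. The Python sketch below is a rough illustration only: it assumes the file consists of bracketed headers followed by "key = value" lines, as described above, and the function name parse_stream_entries, the sample text, and the [OTHER_HEADER] name are all hypothetical. The exact layout of a real info file may differ.

```python
def parse_stream_entries(ini_text):
    """Collect (filename, chno) pairs from every [CRIUSF_DIR_STREAM] block."""
    entries, current = [], None
    for raw in ini_text.splitlines():
        line = raw.strip()
        if line.startswith("[") and line.endswith("]"):
            # A new header closes the previous block, if we were in one.
            if current is not None:
                entries.append(current)
            current = {} if line == "[CRIUSF_DIR_STREAM]" else None
        elif current is not None and "=" in line:
            key, _, value = line.partition("=")
            current[key.strip()] = value.strip()
    if current is not None:
        entries.append(current)
    return [(e.get("filename"), int(e["chno"])) for e in entries if "chno" in e]


# Hypothetical info-file contents, for illustration only.
SAMPLE_INI = """\
[CRIUSF_DIR_STREAM]
filename = ep29_1.wav
chno = 0
[CRIUSF_DIR_STREAM]
filename = ep29_1_JP.wav
chno = 1
[OTHER_HEADER]
filename = ignored
"""
for name, chno in parse_stream_entries(SAMPLE_INI):
    print(chno, name)
```

The headers are parsed by hand rather than with configparser because the same header name repeats, which configparser rejects by default.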


The chno ID identifies the stream channel of the audio file.

  • Please note, multiple files can occupy the same chno ID. This is normal and not the issue.
  • If multiple streams are layered and creating audio issues, extract each using the following -s commands:
chno ID | Command | Description
chno = -1 | crid_mod.exe -b first half of hex key -a second half of hex key -v -x -i -s -1 [path to usm] | Typically used to store the .USM file. Not very useful to extract most of the time.
chno = 0 | crid_mod.exe -b first half of hex key -a second half of hex key -v -x -i -s 0 [path to usm] | Typically contains the main language sound files. Extract this.
chno = 1 | crid_mod.exe -b first half of hex key -a second half of hex key -v -x -i -s 1 [path to usm] | Is used to layer a third stream. This may contain additional sound files or languages [eg JP]

This will separate each stream into a different ADX file. You can also convert them to .wav using the -c option.

  • e.g: crid_mod.exe -b first half of hex key -a second half of hex key -v -x -i -s 0 -c [path to usm]
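
To avoid retyping the command for each chno ID, the argument list can be generated. This is a minimal Python sketch that mirrors the flag order used in the commands above; the function name crid_mod_command and its parameters are hypothetical, and it only builds the command rather than running crid_mod.

```python
def crid_mod_command(key_first, key_second, usm_path,
                     chno=None, to_wav=False, exe="crid_mod.exe"):
    """Build a crid_mod argument list in the same order as the examples above."""
    cmd = [exe, "-b", key_first, "-a", key_second, "-v", "-x", "-i"]
    if chno is not None:
        cmd += ["-s", str(chno)]  # extract only this audio stream
    if to_wav:
        cmd.append("-c")          # convert the ADX audio to WAV
    cmd.append(usm_path)
    return cmd


# One command per chno ID, using the EFIGS hex key from the key table.
for chno in (-1, 0, 1):
    print(" ".join(crid_mod_command("00000000", "035B6784", "mov000.usm", chno=chno)))
```

Each list can be passed to subprocess.run from the folder that contains crid_mod.exe.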

The .m2v video file and the .adx / .wav files can now be joined with FFmpeg.

Please note that pre-rendered cutscenes do not always contain voice-over lines.

Certain USM files only store music and soft reference voice audio that was used to sync the cutscene in multiple languages.

The VOs for these files are layered onto the cutscene in real time in-game and are stored elsewhere, probably in the character AWB files.

Persona 5 Royal Decryption Tutorial

Software needed:

Please note: This example uses the hex key for the EFIGS (US/EU PS4) version. If you have the JP, CN or Steam version of the game, use the hex key for that version instead.

  • The Dec. (decimal) keys are only used to decode audio from .awb/.acb and raw .adx files, not audio from .USM files, so they are not needed for this tutorial.

We will also be using mov000.usm (Opening video) located in the ps4_movieR.cpk as an example during this tutorial.

Crid_Mod

  • Crid_mod.exe uses cmd for commands.

Let's start by opening cmd.

First, we need to point cmd to the folder Crid_Mod.exe is in. We can do that using this command.

cd "Location of Crid_Mod.exe" e.g. cd "C:\Users\Joker\Downloads\CRID.usmDemux_Tool_v1.02-mod"

Press Enter. Now, before we start decrypting, here is a breakdown of Crid_Mod's commands:

Command | Use
-o | (name) [internal name output folder]
-n | (use internal names)
-b | (key1) [upper 32b from 64b key]
-a | (key2) [lower 32b from 64b key]
-m | (file) [audio mask keyfile, size 0x20]
-x | [demux audio]
-v | [demux video]
-i | [demux info]
-c | [convert adx to wav instead of demuxing]
-s | [audio stream chno id]

-b will always be the first half of the hex key (usually 00000000 for consoles), and -a will always be the second half.
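
Since -b and -a take the upper and lower 32 bits of a single 64-bit key, a full key can be split mechanically. The Python sketch below illustrates this; the function name split_key is hypothetical, and the sample values are the hex keys listed earlier in this article joined into one 64-bit number.

```python
def split_key(key64):
    """Split a 64-bit key into the 8-digit hex strings for -b and -a."""
    if not 0 <= key64 <= 0xFFFFFFFFFFFFFFFF:
        raise ValueError("key does not fit in 64 bits")
    upper = key64 >> 32          # value for -b
    lower = key64 & 0xFFFFFFFF   # value for -a
    return f"{upper:08X}", f"{lower:08X}"


# EFIGS (Steam) key: upper half 00234168, lower half 3D2FDBA6.
print(split_key(0x002341683D2FDBA6))   # prints ('00234168', '3D2FDBA6')
# EFIGS console key: the upper half is all zeros.
print(split_key(0x00000000035B6784))   # prints ('00000000', '035B6784')
```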

If you have broken audio, see the Garbled Audio section for how to use -s.

Now that you know what all the commands do, we can start decrypting.

crid_mod.exe -b 00000000 -a 035B6784 -v -x -s 0 "C:\Users\Joker\Downloads\CRID.usmDemux_Tool_v1.02-mod\mov000.usm"

Replace "C:\Users\Joker\Downloads\CRID.usmDemux_Tool_v1.02-mod\mov000.usm" with your .usm file location

  • You can drag the .USM file into cmd to fill in its path automatically.
  • You can use -i to generate an info file describing the video/audio stored inside the USM. This is useful for choosing the -s chno ID; -s 0 is used here.
  • You can use -c to convert the .adx audio to .wav.

You will now have a decrypted .m2v video file and a .adx/.wav audio file. We can now use ffmpeg to join them.

ffmpeg

Now that you have both audio and video files we can point cmd to ffmpeg.exe.

  • Tip: If you copy the FFmpeg executables and Crid_Mod.exe to the same folder, you don't need to cd to a new folder.

cd "Location of ffmpeg.exe" e.g. cd "C:\Users\Joker\Downloads\ffmpeg"

Now, there are many ways to use ffmpeg to convert, encode, transcode etc. video.

  • However in this example, we will combine the audio and video and change the container to .mkv without converting for maximum, lossless quality!

We can do that with the following example command

ffmpeg.exe -i "Location of Video File.m2v" -i "Location of Audio File.wav/.adx" -c copy Outputfile.mkv

  • You can just drag the video and Audio file into cmd after the -i and it will auto set the location for you

Practical example

ffmpeg.exe -i "C:\Users\Joker\Downloads\ffmpeg\mov000.m2v" -i "C:\Users\Joker\Downloads\ffmpeg\mov000_0.adx.wav" -c copy Intro.mkv

  • If you have errors with the .adx file use -c in Crid_Mod to convert it to .wav
  • Note: Using -c to convert audio in Crid_Mod names the output Filename.adx.wav. It's a .wav file.

The new .mkv video can now be opened in VLC.

Alternatively, you can change the extension of the output file to use a different container, e.g. -c copy Intro.mp4

Enjoy!