The Salty Bible
Get Salty's MusicBrainz Picard Script...
1. Organization
1.1. Use a meaningful and consistent library structure
Get Salty's MusicBrainz Picard Naming Script...
- Including the date at the beginning allows sorting releases chronologically and helps identify releases at a glance.
- Including the catalog number or barcode allows identifying releases at a glance. When both are available, the catalog number is prioritized because it is easier to differentiate at a glance.
- Grouping tracks into folders by disc subtitle makes it possible to have different covers for each disc in a box set without embedding.
- Replacing disallowed characters with "_" ensures a folder or file with an empty name is not created, and minimizes loss of information and uncertainty when glancing at the files. The underscore character is a good replacement candidate because it is easy to distinguish and type. Disallowed characters are not replaced with similar-looking Unicode variants because that would produce misleading output.
- The "." character is disallowed because on Windows folders cannot end with it.
- Audio quality is not included because its format differs by audio format and handling releases with multiple audio formats significantly increases the complexity of the format script.
- Rip log score is not included because there are multiple definitions for scoring and computing a score significantly increases the complexity of the format script.
- Releases are not grouped by genre as it is subjective and a release can belong to many genres.
- Do not replace spaces with underscores as it makes the path hard to read and results in loss of information when disallowed characters are already replaced with "_".
- Keeping more information in the path helps users find the releases when sharing on P2P platforms.
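The character replacement described above can be sketched in a few lines of Python. This is only an illustration, not the Picard naming script itself: the guide does not list its exact set of disallowed characters, so the set below (Windows-reserved characters plus control characters) is an assumption, as is the choice to replace only a trailing "." rather than every dot. The name sanitize_component is hypothetical.

```python
import re

# Assumption: Windows-reserved characters plus control characters.
# The guide does not enumerate its exact list of disallowed characters.
DISALLOWED = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def sanitize_component(name: str) -> str:
    """Replace disallowed characters in one folder or file name with "_"."""
    cleaned = DISALLOWED.sub("_", name)
    # Windows folders cannot end with ".", so a trailing dot is replaced too.
    if cleaned.endswith("."):
        cleaned = cleaned[:-1] + "_"
    return cleaned


print(sanitize_component("What?"))        # What_
print(sanitize_component("AC/DC"))        # AC_DC
print(sanitize_component("Vol. 4 etc."))  # Vol. 4 etc_
```

Spaces are left untouched, so an underscore in the result always marks a replaced character.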
1.2. Lowercase all file extensions
In a sampling of 3.3 million files with a file extension, around 1% had an uppercase extension. The files originated from all across a daily driven desktop installation of Windows. As 99% of the files had a lowercase extension, it is reasonable to consider lowercase the standard and to expect that not all software will support uppercase extensions.
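A minimal Python sketch of how extensions could be lowercased in bulk is shown here. The function names are hypothetical, and the sketch assumes that only the final extension matters and that no file with the lowercase name already exists next to the original.

```python
from pathlib import Path


def with_lowercase_extension(path: Path) -> Path:
    """Return the same path with its final extension lowercased."""
    return path.with_suffix(path.suffix.lower())


def lowercase_extensions(root: Path) -> None:
    """Rename every file under root whose extension contains uppercase letters."""
    for path in sorted(root.rglob("*")):
        target = with_lowercase_extension(path)
        # Compare names as strings: Windows paths compare case-insensitively.
        if path.is_file() and target.name != path.name:
            path.rename(target)


print(with_lowercase_extension(Path("Front.PNG")).name)  # Front.png
```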
1.3. Generate a fresh AccurateRip log for all ingested releases with a rip log
AccurateRip logs (.accurip) generated by CueTools are info files about the audio tracks of a CD rip. From the log it is possible to determine if the audio on disk matches other rips of the CD performed around the world. Both the CueTools and AccurateRip databases are consulted for this info.
Only EAC rip logs contain data from both of the aforementioned sources, while XLD and Whipper only support AccurateRip, which also has fewer data entries than CueTools in general. In addition, as a considerable number of shared CD rips are performed near their release date, the rip log may not contain information as trustworthy as an AccurateRip log with freshly fetched data. Furthermore, it is also possible to determine whether the audio files belong to the EAC, XLD or Whipper rip log included with the tracks.
1.4. Use proper physically identifiable names for scans
Sequential scan names like 1.png, 2.png, etc are confusing for people who are not physically familiar with the specific release or packaging type. The same goes for owners coming back after having lost their physical copy.
- To label scans of a series of items (e.g. inserts, booklet pages, booklets themselves, etc), add N to the scan name, where N is the number of the scan. However, if there is only one scan, there is no need to add the numerical suffix, since it is understood that there is only one item.
- If both the front and back of an item are included in separate files, add Front or Back to the scan name. However, if only the front scan is present, there is no need to add the side suffix, since it is understood that the scan is of the primary visual asset, which is the front of the item.
- If both the front and back of a foldable item are included in a single file, add Outside to the scan name. The same principle can be applied to a scan of the inside of an item, where Inside is added to the scan name.
- If there are multiple scans of the same surface, add descriptive terms (e.g. with Sticker, with Obi, etc) to the scan name to differentiate them.
Possible scan names (excluding suffixes) and their physical counterparts:
Scan Name | Physical Counterpart |
---|---|
Back | Back of a disc case or a record sleeve, sometimes the spine is also included in this scan |
Book Page | Page of a book |
Book | A set of pages that have been fastened together inside a cover to be read |
Booklet Page | Page of a booklet |
Booklet | Thin book with a small number of pages and a paper cover |
Card | Piece of cardboard, including playing cards |
Disc | Optical disc, use Matrix for the back of the disc |
Front | Front of a disc case or a record sleeve |
Insert | Piece of paper, often an announcement or advertisement |
Matrix | Back of an optical disc |
Obi | Piece of paper wrapped around the spine of Japanese CDs |
Postcard | Postcard |
Quadfold | Booklet that folds out four times |
Record | Vinyl, shellac or acetate record, use Front and Back for side A and B |
Slipcase Spine | Spine of a slipcase, same concept as Spine |
Slipcase | A box with an opening for the contents to slide out of, also has a Top and Bottom |
Spine | Side strip of a case, often with brief info about the release and the only visible part when the case is stored on a shelf |
Sticker | Adhesive piece of paper |
Tray | Inside of a case with the tray and its liner, sometimes with Disc or with Discs |
Trifold | Booklet that folds out three times |
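The suffix rules can be illustrated with a small Python helper. This is a sketch under assumptions: the guide does not state how the parts are joined or in what order, so the single-space separator and the order number, side, descriptive term are guesses, and scan_name is a hypothetical name.

```python
def scan_name(item, number=None, side=None, note=None, ext="png"):
    """Build a scan file name such as "Booklet Page 3.png".

    item:   a scan name from the table, e.g. "Booklet" or "Insert"
    number: position in a series, omitted when there is only one scan
    side:   "Front", "Back", "Outside" or "Inside", omitted when only the front exists
    note:   a descriptive term such as "with Obi"
    """
    parts = [item]
    if number is not None:
        parts.append(str(number))
    if side is not None:
        parts.append(side)
    if note is not None:
        parts.append(note)
    # Assumption: parts are joined with single spaces.
    return " ".join(parts) + "." + ext


print(scan_name("Booklet Page", number=3))   # Booklet Page 3.png
print(scan_name("Insert", side="Back"))      # Insert Back.png
print(scan_name("Front", note="with Obi"))   # Front with Obi.png
```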
1.5. Rename .cue, .log and .accurip files to CD1, CD2, etc
Short and concise names make it easier to identify the file type and avoid having problems with the path length limit.
1.6. Delete .m3u, .m3u8, foo_dr.txt and audiochecker.log files
These files provide no useful information about a release.
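One way to locate these files before removing them is sketched below in Python. The helper names are hypothetical, and matching names case-insensitively is an assumption rather than something the rule states. The sketch only lists candidates so they can be reviewed first.

```python
from pathlib import Path

# Names and extensions taken from the rule above, compared case-insensitively.
DISPOSABLE_EXTENSIONS = {".m3u", ".m3u8"}
DISPOSABLE_NAMES = {"foo_dr.txt", "audiochecker.log"}


def is_disposable(path: Path) -> bool:
    """True for playlist files and the two report files named in the rule."""
    name = path.name.lower()
    return name in DISPOSABLE_NAMES or path.suffix.lower() in DISPOSABLE_EXTENSIONS


def find_disposable(root: Path) -> list[Path]:
    """List matching files under root without deleting anything."""
    return [p for p in sorted(root.rglob("*")) if p.is_file() and is_disposable(p)]
```

After reviewing the list returned by find_disposable, each entry can be removed with its unlink() method.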
1.7. Use short variants for media types with multiple file extensions
Use short file extensions for .jpg and .tif files.
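As a rough Python illustration, the mapping could look like the following. It assumes that .jpeg and .tiff are the long variants the rule has in mind. The guide does not name them, and with_short_extension is a hypothetical helper.

```python
from pathlib import Path

# Assumption: ".jpeg" and ".tiff" are the long variants meant by the rule.
SHORT_VARIANTS = {".jpeg": ".jpg", ".tiff": ".tif"}


def with_short_extension(path: Path) -> Path:
    """Return the path with a long extension swapped for its short variant."""
    suffix = path.suffix.lower()
    # Extensions without a short variant are returned unchanged.
    return path.with_suffix(SHORT_VARIANTS.get(suffix, path.suffix))


print(with_short_extension(Path("Front.jpeg")).name)  # Front.jpg
```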
1.8. Do not pick apart or merge releases
Releases must be considered inseparable. Deleting, adding or swapping tracks of a release compromises its integrity: the release becomes unverifiable as a whole and its album-level metadata no longer matches what is actually stored on disk.
2. Formats
2.1. Encode all lossless PCM audio using libFLAC with the highest compression level
FLAC is a free, open source, well documented and broadly supported lossless audio codec. The most recent version of libFLAC, which is the reference implementation of the codec, provides the highest compression ratio when compared to earlier versions of libFLAC and the FFmpeg encoder.
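A minimal sketch of driving the reference encoder from Python follows. It assumes the flac command line tool (which is built on libFLAC) is installed and on the PATH, and that the source is a format it can read, such as WAV. The function names are hypothetical.

```python
import subprocess
from pathlib import Path


def flac_command(source: Path, target: Path) -> list[str]:
    """Build a flac command line using the highest compression preset."""
    return [
        "flac",
        "--best",    # highest compression preset, same as -8
        "--verify",  # decode while encoding and compare against the input
        "-o", str(target),
        str(source),
    ]


def encode(source: Path, target: Path) -> None:
    """Run the encoder and raise if it reports an error."""
    subprocess.run(flac_command(source, target), check=True)
```

The --verify flag is not required by the rule. It is included here because it makes the encoder check its own output while encoding.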
2.2. Do not encode lossless to lossy or lossy to lossless
Converting from lossless to lossy reduces the quality and undermines the verifiability of the release, while converting from lossy to lossless offers no benefit and is misleading. The only scenario in which converting from lossy to lossless is acceptable is when the source uses a rare and generally unsupported lossy codec.
2.3. Do not resample audio files
Resampling audio is a lossy process. If it is not done with the correct settings, the result will often not match the quality of the same audio downloaded from the original source.
2.4. Keep covers square if feasible
If a cover is within about 5% of a 1.0 aspect ratio, crop it to be square if at all possible. Sometimes covers include erroneous white borders, which should not be counted towards the aspect ratio and should be fixed by cropping.
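The 5% check can be written as a short calculation. The Python sketch below assumes a centered crop, which the guide does not prescribe. In practice the crop position depends on where the artwork sits, and square_crop_box is a hypothetical name.

```python
def square_crop_box(width: int, height: int, tolerance: float = 0.05):
    """Return a centered (left, top, right, bottom) square box, or None.

    None means the cover is already square, or its aspect ratio differs
    from 1.0 by more than the tolerance and it is left as it is.
    """
    ratio = width / height
    if width == height or abs(ratio - 1.0) > tolerance:
        return None
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)


print(square_crop_box(1030, 1000))  # (15, 0, 1015, 1000)
print(square_crop_box(1500, 1000))  # None
```

The tuple order matches the box argument of Pillow's Image.crop, should that library be used for the actual cropping.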
2.5. Keep the highest resolution cover available
Keeping high resolution album covers ensures that they will remain visually clear as monitor resolutions continue to increase.
2.6. Prefer 100% log CD rips over low quality WEB rips
A properly logged CD rip has more verification value than a 16-bit 44.1 kHz WEB rip. Additionally, it is not worth sacrificing storage space for hi-res WEB rips that do not make effective use of the available bit depth and spectrum limits or originate from a low quality source.
2.7. Avoid audio above 96kHz
The storage requirements at such high resolutions are too high to be practical.
2.8. Encode and optimize all lossless scans to PNG
PNG is a widely used and broadly supported lossless image codec. Keeping all scans in the same format makes it easier to manage them. However, because many encoders cannot fully optimize the produced image files, the images should additionally be run through an optimizer like oxipng to further save on disk space.
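As an illustration, oxipng could be invoked from Python as sketched below. This assumes oxipng is installed and on the PATH. The optimization level is a choice made for this example, not a setting taken from the guide, and the function names are hypothetical.

```python
import subprocess
from pathlib import Path


def oxipng_command(scan: Path) -> list[str]:
    """Build an oxipng command line for one PNG file."""
    # "-o max" selects the most thorough optimization preset.
    # oxipng rewrites the file in place by default.
    return ["oxipng", "-o", "max", str(scan)]


def optimize_scans(folder: Path) -> None:
    """Optimize every PNG file directly inside folder."""
    for scan in sorted(folder.glob("*.png")):
        subprocess.run(oxipng_command(scan), check=True)
```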
2.9. Do not transliterate text
Languages should be written in their native script as loss of nuance and errors are very common in transliteration. Sort tags should be used for transliterations.
2.10. Do not alter audio files
Do not edit, normalize, equalize, compress, amplify, fade or otherwise destructively and irreversibly alter audio files.
2.12. Avoid rips of analog media
Analog audio often has noticeable artifacts due to the nature of the ripping process. As no two analog rips are the same, the rips also cannot feasibly be compared for accuracy.
2.13. Avoid MQA
MQA is a proprietary lossy audio codec misleadingly stored in a PCM container.
3. Tags
3.1. Properly utilize audio tags
Filling in metadata helps music players organize and display releases in a meaningful way. Due to the inherent limitations of what tags can reasonably convey, it is important to carefully consider what info is stored and how music players will perceive it.
Descriptions of common tags and ways to utilize them:
Tag Name | Description |
---|---|
ALBUM | Album scoped, single-valued. Title of the specific edition of the release with an edition suffix where applicable. |
ALBUMARTIST | Album scoped, single-valued. All album artists, except featured artists, with all join phrases as a single value. This ensures the directory structure and music players will group the releases together, ignoring ephemeral pairups. |
ARTIST | Track scoped, single-valued or multi-valued. All track artists. Depending on how scrobbling works in the music player of choice, it might be better to treat it as a multi-valued tag; otherwise it can be treated as single-valued to preserve the relationships between the artists. |
BARCODE | Album scoped, multi-valued. Release specific 14, 13, 12, 8 or 6 digit UPC/EAN barcodes. Usually found on the back of the case. |
CATALOGNUMBER | Album scoped, multi-valued. Release specific catalog numbers. Usually found on the side of the case or on the disc itself. |
COMPOSER | Track scoped, multi-valued. All non-fictional composers as multiple values. Especially important for classical music. |
DATE | Album scoped, single-valued. Release date of the specific edition in "YYYY-MM-DD", "YYYY-MM" or "YYYY" format, depending on what parts are known. |
DISCNUMBER | Disc scoped, single-valued. Number of the disc, not padded. |
DISCSUBTITLE | Disc scoped, single-valued. Mainly used for box sets where each disc represents a different album or named collection of tracks. For example, the Grand Theft Auto: San Andreas Official Soundtrack - Box Set release, where each disc represents a radio station. |
DISCTOTAL | Album scoped, single-valued. Total number of discs in the release, not padded. |
GENRE | Track scoped, multi-valued. In addition to the standard genres, use the following genres to provide more information about a track: |
ISRC | Track scoped, single-valued. Used to help identify tracks on streaming services and web stores. |
MEDIA | Album scoped, multi-valued. All mediums contained in the specific release as multiple values. Example values: Scoped to an album to allow searching for releases containing specific mediums. As music players often do not support directly including video files in a library, this makes it possible to find releases that include Blu-rays. |
MUSICBRAINZ_*ID | Various scopes, multi-valued. Associates files with a concrete data source in the form of MusicBrainz database entries. MusicBrainz contains extensive data about tracks and releases that is typically not stored in tags due to its complex nature. Software can use these MusicBrainz ID tags to fetch additional information for display, management or scrobbling purposes. |
ORIGINALDATE | Album scoped, single-valued. Original release date in YYYY-MM-DD format. Should be the release date of the oldest edition of the release, i.e. the first release in a release group. |
RELEASETYPE | Album scoped, multi-valued. Example values: |
SOURCE | Album scoped, single-valued. Name of the streaming service where the files stored on disk are from. |
TITLE | Track scoped, single-valued. Title of a track. |
TRACKNUMBER | Track scoped, single-valued. Number of a track on a medium, not padded. |
TRACKTOTAL | Disc scoped, single-valued. Total count of tracks on a medium, not padded. |
URL | Album scoped, single-valued. Address on the streaming service where the files stored on disk originate from, in the shortest form possible to save space. Principles: Examples: |
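Two of the formats in the table, DATE and BARCODE, are precise enough to check mechanically. The Python sketch below only validates the shape of a value: it does not confirm that a date exists on the calendar or that a barcode's check digit is correct. The function names are hypothetical.

```python
import re

# "YYYY", "YYYY-MM" or "YYYY-MM-DD", as listed for the DATE tag.
DATE_PATTERN = re.compile(r"[0-9]{4}(-[0-9]{2}(-[0-9]{2})?)?")

# Digit counts listed for the BARCODE tag.
BARCODE_LENGTHS = {14, 13, 12, 8, 6}


def is_valid_date(value: str) -> bool:
    """Format check only: the month and day ranges are not verified."""
    return DATE_PATTERN.fullmatch(value) is not None


def is_valid_barcode(value: str) -> bool:
    """Accept ASCII-digit barcodes with one of the listed lengths."""
    return value.isascii() and value.isdigit() and len(value) in BARCODE_LENGTHS


print(is_valid_date("1997-05"))     # True
print(is_valid_date("21-05-1997"))  # False
print(is_valid_barcode("12345"))    # False
```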
3.2. Always use "Various Artists" for various artists
This way all releases with various artists will be consistent and grouped together.
3.3. Do not embed images
Embedding images into each audio file of a release uses more space than storing a single image file in the folder. Additionally, embedded images are usually of lower quality to reduce space usage, despite occupying approximately the same amount of space as a single high-quality image file.
3.4. Do not include featured artists in track titles
An artist is not part of the song title.
3.5. Do not use fancy Unicode symbols for common punctuation marks
Using Unicode variants of common symbols makes management and search tasks more challenging due to the characters looking extremely similar and not always being fuzzy matched to their common counterparts.
Replacement map from fancy Unicode symbols to ASCII:
Name | From | To |
---|---|---|
NO-BREAK SPACE | ◌ |   |
LEFT SINGLE QUOTATION MARK | ‘ | ' |
RIGHT SINGLE QUOTATION MARK | ’ | ' |
SINGLE LOW-9 QUOTATION MARK | ‚ | , |
SINGLE HIGH-REVERSED-9 QUOTATION MARK | ‛ | ' |
LEFT DOUBLE QUOTATION MARK | “ | " |
RIGHT DOUBLE QUOTATION MARK | ” | " |
DOUBLE LOW-9 QUOTATION MARK | „ | " |
DOUBLE HIGH-REVERSED-9 QUOTATION MARK | ‟ | " |
HYPHEN | ‐ | - |
NON-BREAKING HYPHEN | ‑ | - |
EN DASH | – | - |
EM DASH | — | - |
FIGURE DASH | ‒ | - |
HORIZONTAL BAR | ― | - |
ONE DOT LEADER | ․ | . |
TWO DOT LEADER | ‥ | .. |
HORIZONTAL ELLIPSIS | … | ... |
DOUBLE EXCLAMATION MARK | ‼ | !! |
DOUBLE QUESTION MARK | ⁇ | ?? |
FRACTION SLASH | ⁄ | / |
DIVISION SLASH | ∕ | / |
WAVE DASH | 〜 | ~ |
FULLWIDTH TILDE | ～ | ~ |
FULLWIDTH LEFT PARENTHESIS | （ | ( |
FULLWIDTH RIGHT PARENTHESIS | ） | ) |
FULLWIDTH LEFT SQUARE BRACKET | ［ | [ |
FULLWIDTH RIGHT SQUARE BRACKET | ］ | ] |
FULLWIDTH LESS-THAN SIGN | ＜ | < |
FULLWIDTH GREATER-THAN SIGN | ＞ | > |
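The replacement map translates directly into a Python translation table. The sketch below refers to each character by its Unicode name, so the entries can be checked against the Name column. The function name to_plain_punctuation is hypothetical.

```python
# Keys use Unicode character names so they can be compared with the table.
REPLACEMENTS = {
    "\N{NO-BREAK SPACE}": " ",
    "\N{LEFT SINGLE QUOTATION MARK}": "'",
    "\N{RIGHT SINGLE QUOTATION MARK}": "'",
    "\N{SINGLE LOW-9 QUOTATION MARK}": ",",
    "\N{SINGLE HIGH-REVERSED-9 QUOTATION MARK}": "'",
    "\N{LEFT DOUBLE QUOTATION MARK}": '"',
    "\N{RIGHT DOUBLE QUOTATION MARK}": '"',
    "\N{DOUBLE LOW-9 QUOTATION MARK}": '"',
    "\N{DOUBLE HIGH-REVERSED-9 QUOTATION MARK}": '"',
    "\N{HYPHEN}": "-",
    "\N{NON-BREAKING HYPHEN}": "-",
    "\N{EN DASH}": "-",
    "\N{EM DASH}": "-",
    "\N{FIGURE DASH}": "-",
    "\N{HORIZONTAL BAR}": "-",
    "\N{ONE DOT LEADER}": ".",
    "\N{TWO DOT LEADER}": "..",
    "\N{HORIZONTAL ELLIPSIS}": "...",
    "\N{DOUBLE EXCLAMATION MARK}": "!!",
    "\N{DOUBLE QUESTION MARK}": "??",
    "\N{FRACTION SLASH}": "/",
    "\N{DIVISION SLASH}": "/",
    "\N{WAVE DASH}": "~",
    "\N{FULLWIDTH TILDE}": "~",
    "\N{FULLWIDTH LEFT PARENTHESIS}": "(",
    "\N{FULLWIDTH RIGHT PARENTHESIS}": ")",
    "\N{FULLWIDTH LEFT SQUARE BRACKET}": "[",
    "\N{FULLWIDTH RIGHT SQUARE BRACKET}": "]",
    "\N{FULLWIDTH LESS-THAN SIGN}": "<",
    "\N{FULLWIDTH GREATER-THAN SIGN}": ">",
}

# str.maketrans accepts single-character keys mapped to strings of any length.
TABLE = str.maketrans(REPLACEMENTS)


def to_plain_punctuation(text: str) -> str:
    """Replace the listed Unicode punctuation with its ASCII counterpart."""
    return text.translate(TABLE)


print(to_plain_punctuation("It\N{RIGHT SINGLE QUOTATION MARK}s over\N{HORIZONTAL ELLIPSIS}"))  # It's over...
```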
3.6. Do not use artist sort names as artist names
Family names, honorary titles, definite articles, "The", "DJ", etc can only be moved to the end in sort tags like ARTISTSORT and ALBUMARTISTSORT, as those tags are specifically meant for customizing values to ensure a specific sort order. Everywhere else the artist should be left as it is depicted in official sources. For extensive information on constructing sort names, see the MusicBrainz Artist Sort Name Style Guide.
Updated 2023-03-23