Y grain AVX2 implementations
Add film grain checkasm tests
Split out film grain block functions into a DSPContext
obu: fix deriving render_width and render_height from reference frames
Silence some clang-cl warnings
x86: Fix buffer overread in mc put
x86: Increase precision of the final inverse ADST transform stages
arm64: itx: Do the final calculation of adst4/adst8/adst16 in 32 bit to avoid too narrow clipping
Prefer __builtin_unreachable() over __assume() on clang-cl
Fix clang-cl assertion warning
arm: Fix assembling with older binutils
TileContext: reorder scratch buffer to avoid conflicts
CI: use "needs:" to break the static build, test stage dependency
Apply high-bitdepth adjustment of pixel index after delta calculation
Use linear interpolation for high bit-depth pixel values
Fix bugs in film grain generation
arm: mc: Making code style consistent
arm: mc: Push fewer registers in w_mask
arm: mc: Remove an unused instruction in w_mask
Check absolute tile positions in sb-to-tile_idx table generation
Use 64-bit integers for warp_affine mvx/mvy calculations
x86: Fix inverse ADST transform overflows
Optimize coef ctx calculations
Consolidate horizontal scan tables
Change scan tables from int16_t to uint16_t
Utilize the constraints in assertions to improve code generation
arm64: mc: NEON implementation of w_mask_444/422/420 function
arm64: mc: NEON implementation of blend, blend_h and blend_v function
Prefer `do {} while (0);` over `while (0);`
Cosmetics: CDF tables
x86: Add an msac function for coefficient hi_tok decoding
Add msac optimizations
Remove unused CDFs
dav1dplay: abort if no input filename is provided
meson: move dav1dplay to a new examples section
decode_coefs reuse lossless variable
Unroll hi_token loop in decode_coeff
Quick out if seg_id == 0 in get_prev_frame_segid
Avoid CDF overreads in gather_top_partition_prob()
Set thread names on macOS
Set thread names on Windows 10
arm: mc: Speed up due to memory alignment in ldr/str instructions
checkasm: Catch SIGBUS in addition to the other signals
Export frame ITU-T T.35 Metadata
Improve wiener filter C implementation using loop interchange
tools: player: Add missing string.h header
Set thread names on BSDs
vsx: Add shorter types and unpack helpers
vsx: Set the correct alignment constraints
Update NEWS for 0.4.0
Change SDL include in dav1dplay
arm: mc: neon: Merge load and other related operations in blend/blend_h/blend_v functions
arm: mc: neon: Reduce usage of general purpose registers in blend/blend_v functions
arm: mc: neon: Use vld with ! post-increment instead of a register in blend/blend_h/blend_v function
tools: add a simple player example
Fix handling of some memory allocation failures
Set thread names on Linux
arm: mc: NEON implementation of w_mask_444/422/420 function
dav1d_fuzzer: use Dav1dSettings.frame_size_limit instead of a custom picture allocator
Fix memory leak in dav1d_submit_frame()
obu: also check frame_size_limit with Frame Header OBUs
Improve robustness of handling malloc failures
Correctly return an error on malloc failure
Fix potential memory leak
arm: mc: neon: Improvement in blend_v function
Reduce the size of frame threading buffers
Consolidate scratch buffers
build: fix meson deprecation warning
checkasm: msac: Add verbose printouts on failures
checkasm: cdef: Add verbose prints for output data (and relevant input)
checkasm: looprestoration: Use checkasm_check*
checkasm: loopfilter: Use checkasm_check*
checkasm: ipred: Use checkasm_check*
ci: add test stage for clang armv7a build
checkasm: mc: Use checkasm_check_* for better debuggability
arm64: itx: Add NEON optimized inverse transforms
tools: Use DAV1D_ERR for strerror calls
include: Consistently use DAV1D_ERR in docs
checkasm: itx: Add verbose printouts for the pixel differences
checkasm: Add functions for printing pixel buffers
arm: mc: Move the blend functions up above put/prep
arm64: Consistently name macro arguments tX for temporaries in transposes
cli: use mach_absolute_time as fallback for clock_gettime on darwin. Fixes #283
arm: mc: NEON implementation of blend, blend_h and blend_v function
checkasm: Add an option to benchmark the C code as well
checkasm: Add a --help option to checkasm
checkasm: Add a readtime impl for ppc
Initial PowerPC support
meson: Look for librt if clock_gettime isn't found without it
meson: simplify a few checks for x86 targets
x86: include config.asm in x86inc instead of every asm file
checkasm: Check for __ARM_ARCH >= 7 for the arm cpu timer inline assembly
CI: Added ppc64le build and test jobs
Update NEWS for 0.4.0
output: automatically use null muxer for /dev/null
checkasm: Fix out-of-bounds read in warp8x8 tests
x86: Optimize warp8x8 AVX2 asm