Implement shader compile counter (currently not translated, will change, need to pull changes.)
Remove event logic in favor of a single init function.
Thanks @MutantAura
Specifically, this setting causes the translation load core count to get reduced by two-thirds, for lower-power but still fast loading, and for unstable CPUs.
* Add Texture Size Capacity and 8GB Dram Build
* Update AutoDeleteCache.cs
* Dynamic Texture Cache (WIP)
* Change to float Multiplier, in-case it needs fine-tuning.
* Delete src/src.sln
* Update AutoDeleteCache.cs
* Format
* Fix Formatting
* Add DefaultTextureSizeCapacity and MemoryScaleFactor
- Also remove redundant New Lines
* Fix 4GB dram crashing
* Format newline
* Refractor
- Added Initialize() function to TextureCache and AutoDeleteCache
- Removed GetMaxTextureCapacity() function and instead added _maxCacheMemoryUsage
- Added private const MaxTextureSizeCapacity to AutoDelete Cache
- Added TextureCache.Initialize() to MemoryManager in order to fetch MaxGpuMemory at the right time.
- Moved and Changed Logger.Info for Gpu Memory to Logger.Notice and Moved it to PrintGpuInformation function.
- Opted to use a ternary operator for the Initialize function, I think it looks cleaner than bunch of if statements.
* Update src/Ryujinx.Graphics.Gpu/Image/AutoDeleteCache.cs
Co-authored-by: gdkchan <gab.dark.100@gmail.com>
* maxMemory to CacheMemory, use Clamp instead of Ternary. Changed MinTextureCapacity 1GiB to 512 MiB
* Update src/Ryujinx.Graphics.Gpu/Image/AutoDeleteCache.cs
Co-authored-by: gdkchan <gab.dark.100@gmail.com>
* Format comment
* comment context
* Increase TextureSize capacity for OpenGL back to 1024
- Added a new const ulong for OpenGLTextureSizeCapacity
* Fix changes from last commit.
* Adjust last OpenGL changes.
* Remove garbage VSC file
* Update src/Ryujinx.Graphics.Gpu/Image/AutoDeleteCache.cs
Co-authored-by: gdkchan <gab.dark.100@gmail.com>
* Update src/Ryujinx.Graphics.Gpu/Image/AutoDeleteCache.cs
Co-authored-by: gdkchan <gab.dark.100@gmail.com>
* Update src/Ryujinx.Graphics.Gpu/Image/AutoDeleteCache.cs
Co-authored-by: gdkchan <gab.dark.100@gmail.com>
---------
Co-authored-by: gdkchan <gab.dark.100@gmail.com>
* refactor(perf): pass MemoryOwner<byte> around as itself rather than IMemoryOwner<byte>
* fix(perf): get span via MemoryOwner<byte>.Span property instead of through Memory property
* fix: incorrect comment change
* Allow creating texture aliases on texture pool
* Delete old image format override code
* New format incompatible alias
* Missing bounds check
* GetForBinding now takes FormatInfo
* Make FormatInfo struct more compact
* Add area sampling scaler to allow for super-sampled anti-aliasing.
* Area scaling filter doesn't have a scaling level.
* Add further clarification to the tooltip on how to achieve supersampling.
* ShaderHelper: Merge the two CompileProgram functions.
* Convert tabs to spaces in area scaling shaders
* Fixup Vulkan and OpenGL project files.
* AreaScaling: Replace texture() by texelFetch() and use integer vectors.
No functional difference, but it cleans up the code a bit.
* AreaScaling: Delete unused sharpening level member.
Also rename _scale to _sharpeningLevel for clarity and consistency.
* AreaScaling: Delete unused scaleX/scaleY uniforms.
* AreaScaling: Force the alpha to 1 when storing the pixel.
* AreaScaling: Remove left-over sharpening buffer.
* Vulkan: Feedback loop improvements
This PR allows the Vulkan backend to detect attachment feedback loops. These are currently used in the following ways:
- Partial use of VK_EXT_attachment_feedback_loop_layout
- All renderable textures have AttachmentFeedbackLoopBitExt
- Compile pipelines with Color/DepthStencil feedback loop flags when present
- Support using FragmentBarrier for feedback loops (fixes regressions from https://github.com/Ryujinx/Ryujinx/pull/7012 )
TODO:
- AMD GPUs may need layout transitions for it to properly allow textures to be used in feedback loops.
- Use dynamic state for feedback loops. The background pipeline will always miss since feedback loop state isn't known on the GPU project.
- How is the barrier dependency flag used? (DXVK just ignores it, there's no vulkan validation...)
- Improve subpass dependencies to fix validation errors
* Mark field readonly
* Add feedback loop dynamic state
* fix: add MoltenVK resolver workaround
fix: add MoltenVK resolver workaround
* Formatting
* Fix more complaints
* RADV dcc workaround
* Use dynamic state properly, cleanup.
* Use aspects flags in more places
* chore: replace `ByteMemoryPool` usage with `MemoryOwner<byte>`
* refactor: `PixelConverter.ConvertR4G4ToR4G4B4A4()` - rename old `outputSpan` to `outputSpanUInt16`, reuse same output `Span<byte>` as newly-freed name `outputSpan`
* eliminate temporary buffer allocations
* chore, perf: use MemoryOwner<byte> instead of IMemoryOwner<byte>
Vulkan spec states that input topology should always be PatchList when a tessellation pipeline is present. The AMD GPU on windows crashes so hard it BSODs the machine if this isn't the case, so it's forced here just in case.
I'm not sure what providing a different topology here would even do, as you'd think it would always be a patch list input.
This barrier has always been missing, but it only became apparent when #7012 merged.
I also added some barriers in case the target buffer used here is used by other commands, though right now it isn't.
Fixes a regression where water would turn white on AMD GPUs with the proprietary driver. May fix other issues on this driver.
* More guarantees for buffer correct placement, defer guest requested buffers
* Split RP on indirect barrier rn
* Better handling for feedback loops.
* Qualcomm barriers suck too
* Fix condition
* Remove unused field
* Allow render pass barriers on turnip for now
* Do not use template updates for buffer textures and buffer images
* No need to do it for images
* Simply buffer texture existence check
* Pipeline is now unused on DescriptorSetUpdater
* Fix some validation errors
* Whitespace correction
* Resolve some runtime validation errors.
* Whitespace
* Properly fix usage realted validation error by setting Extended Usage image creation flag.
* Only if supported
* Remove checking extension for features that are core functionality of Vulkan 1.2
* Ensure descriptor sets are only re-used when all command buffers using it have completed
* Fix some SPIR-V capabilities
* Set update after bind flag if we exceed limits
* Simpler fix for Intel
* Format whitespace
* Make struct readonly
* Add barriers for extra set arrays too
* Dynamic state for Depth Bounds should not be passed to PipelineDynamicStateCreateInfo as the command to set them is never called.
Do not pass pointer to viewport and scissor as those dynamic states should be supported on all devices.
Same as above for DepthBias values.
* Code Review Suggestion
* Pipeline derivation is not implemented and is not suggested.
* Depth Bounds are not used.
* feat: add new types MemoryOwner and SpanOwner
* use SpanOwner instead of new array allocation
* change for loop condition to `fences.Length` instead of `count` to elide Span boundary checks on `fences`
* Report base and extra sets from the backend
* Pass texture set index everywhere
* Key textures using set and binding (rather than just binding)
* Start using extra sets for array textures
* Shader cache version bump
* Separate new commands, some PR feedback
* Introduce new manual descriptor set reservation method that prevents it from being used by something else while owned by an array
* Move bind extra sets logic to new method
* Should only use separate array is MaximumExtraSets is not zero
* Format whitespace
* intel workaround
built on top of the amd workaround
* forgot to update the note
* Logic Change
Enabled workaround for all vendors that aren't nvidia
* Applied Suggestions
* GPU: Migrate buffers on GPU project, pre-emptively flush device local mappings
Essentially retreading #4540, but it's on the GPU project now instead of the backend. This allows us to have a lot more control + knowledge of where the buffer backing has been changed and allows us to pre-emptively flush pages to host memory for quicker readback. It will allow us to do other stuff in the future, but we'll get there when we get there.
Performance greatly improved in Hyrule Warriors: Age of Calamity. Performance notably improved in TOTK (average). Performance for BOTW restored to how it was before #4911, perhaps a bit better.
- Rewrites a bunch of buffer migration stuff. Might want to tighten up how dispose stuff works.
- Fixed an issue where the copy for texture pre-flush would happen _after_ the syncpoint.
TODO: remove a page from pre-flush if it isn't flushed after a certain number of copies.
* Add copy deactivation
* Fix dependent virtual buffers
* Remove logging
* Fix format issues (maybe)
* Vulkan: Remove backing swap
* Add explicit memory access types for most buffers
* Fix typo
* Add device local force expiry, change buffer inheritance behaviour
* General cleanup, OGL fix
* BufferPreFlush comments
* BufferBackingState comments
* Add an extra precaution to BufferMigration
This is very unlikely, but it's important to cover loose ends like this.
* Address some feedback
* Docs
* Add support for bindless textures from shader input (vertex buffer)
* Shader cache version bump
* Format whitespace
* Remove cache entries on pool removal, disable for OpenGL
* PR feedback
* rebase
* add methods Ryyjinx.Common EmbeddedResources and SteamUtils
* GAL changes - change SetData() methods and ThreadedTexture commands to use IMemoryOwner<byte> instead of SpanOrArray<byte>
* Ryujinx.Graphics.Texture: change texture conversion methods to return IMemoryOwner<byte> and allocate from ByteMemoryPool
* Ryujinx.Graphics.OpenGL: update ITexture and Texture-like types with SetData() methods to take IMemoryOwner<byte> instead of SpanOrArray<byte>
* Ryujinx.Graphics.Vulkan: update ITexture and Texture-like types with SetData() methods to take IMemoryOwner<byte> instead of SpanOrArray<byte>
* Ryujinx.Graphics.Gpu: update ITexture and Texture-like types with SetData() methods to take IMemoryOwner<byte> instead of SpanOrArray<byte>
* Remove now-unused SpanOrArray<T>
* post-rebase cleanup
* PixelConverter: remove unsafe modifier on safe methods, and remove one unnecessary cast
* use ByteMemoryPool.Rent() in GetWritableRegion() impls
* fix formatting, rename `ReadRentedMemory()` to `ReadFileToRentedMemory()``
* Texture.ConvertToHostCompatibleFormat(): dispose of `result` in Astc decode branch
* Add support for large sampler arrays on Vulkan
* Shader cache version bump
* Format whitespace
* Move DescriptorSetManager to PipelineLayoutCacheEntry to allow different pool sizes per layout
* Handle array textures with different types on the same buffer
* Somewhat better caching system
* Avoid useless buffer data modification checks
* Move redundant bindings update checking to the backend
* Fix an issue where texture arrays would get the same bindings across stages on Vulkan
* Backport some fixes from part 2
* Fix typo
* PR feedback
* Format whitespace
* Add some missing XML docs
* Move some init logic out of PrintGpuInformation, then delete it
* Disable push descriptors for Intel ARC on Windows
* Re-add PrintGpuInformation just to show it in the log
AMD GPUs (possibly just RDNA 3) could hang with the previous value
until the MaxQueryRetries was hit.
Fix#6056
Co-authored-by: riperiperi <rhy3756547@hotmail.com>
Disables push descriptors on older NVIDIA GPUs (10xx and below), since it is clearly broken beyond comprehension. The existing workaround wasn't good enough and a more thorough one will probably cost more performance than the feature gains. The workaround has been removed.
Fixes#6331.
If more than 16 barriers were queued at one time, the _queuedBarrierCount would no longer match the number of remaining barriers, because when breaking out of the loop consuming them it deleted all barriers, not just the 16 that were consumed.
Should fix freezes that started occurring with #6240. Fixes issue #6338.
* WIP barrier batch
* Add store op to image usage barrier
* Dispose the barrier batch
* Fix encoding?
* Handle read and write on the load op barrier.
Load op consumes read accesses but does not add one, as the only other operation that can read is another load.
* Simplify null check
* Insert barriers on program change in case stale bindings are reintroduced
* Not sure how I messed this one up
* Improve location of bindings barrier update
This is also important for emergency deferred clear
* Update src/Ryujinx.Graphics.Vulkan/BarrierBatch.cs
Co-authored-by: Mary Guillemard <thog@protonmail.com>
---------
Co-authored-by: Mary Guillemard <thog@protonmail.com>
* Fix Push Descriptors
* Use push descriptor templates
* Use reserved bindings
* Formatting
* Disable when using MVK
("my heart will go on" starts playing as thousands of mac users shed a tear in unison)
* Introduce limit on push descriptor binding number
The bitmask used for updating push descriptors is ulong, so only 64 bindings can be tracked for now.
* Address feedback
* Fix logic for binding rejection
Should only offset limit when reserved bindings are less than the requested one.
* Workaround pascal and older nv bug
* Add GPU number detection for nvidia
* Only do workaround if it's valid to do so.