Prison Architect

Lenticular Float32 1.2b BETA
33 comments
megatails 4 July 2021 at 22:34
Not working for me. With only the mod, a lot of squares appear in the scene, and when I apply a texture mod, it becomes a bunch of white squares.
jjwalker  [author] 23 Feb 2021 at 18:21
I am going to finish the last bit of this and make it 2.0 status soon...

I kind of got fed up with Paradox, just ditched it, and had to be away for a while. I do believe, however, that my displeasure with Paradox should have no bearing on completing this.

So stay tuned for 2.0 to be available (this item will automatically update to 2.0 when complete).

I hope the few of you using this fix for Paradox's complete disregard for the PC community and for fixing their game find it enjoyable and pleasant. I did it for y'all; I did it to try and save a game I really love. But there has to be an end.

Thank you Friends!
jjwalker  [author] 12 Jan 2021 at 9:06
@SagLysa88

No, this is strictly for rendering and performance. Specifically, I added dynamic lighting and effects while improving performance by roughly 300% in some cases.

Your cooks might still be buggy, but at least the game will render quickly :)
SagLysa88 11 Jan 2021 at 18:46
Sorry if this has already been asked or addressed, but will this solve or at least help with the issues with the chefs and the kitchen bugs?
jjwalker  [author] 31 Dec 2020 at 8:40
I am really trying to polish this up, and if it is bug free, it is going to go straight to v1.3 instead of 1.2 Beta Rev. F4.
I think that once it is done, it will be done. I don't know what else I could possibly add to it at this point without either 1) disturbing its 2D nature (it's a 2D sprite game and that is what makes it fun) or 2) detracting from what the original goal was.
This mod was originally supposed to just make the game playable beyond the smallest map and, as usual with any project I do, it got expanded, but I think it was for the better. I think once this is done, it'll really be the perfect balance of speed and visuals without ruining the 2D sprite atmosphere. So far, I have made the game 300% faster in some cases, and the additional eye candy is being done carefully to keep it that way. I just don't see what else I could add or, if I added it, how I would keep that speed increase... So soon, I want this to be great, bug free, and complete.
- I'll keep y'all posted. :)
jjwalker  [author] 23 Dec 2020 at 20:17
1.2 Rev. F4 progress notes:
I am delayed again because I have been implementing some neat stuff.
I've added:
- Shader-based MSAA - Very fast because I am sampling neighbor pixels as they run through the GPU's pipeline, so it is very cheap. Pixels are shaded in 2x2 batches, so sampling a neighboring pixel while "in-flight" is extremely fast. This greatly reduces texture aliasing while moving the camera.
- Variance-based shadows - In progress; they work, but need tweaking.
- Per-pixel shadows and post-process filter - I found I can use the final pass used for shadows for post-processing. I have a form of self-shadowing going on which was the result of a mistake (one I am glad I made, lol). Pictures are in the link below and the results are startling.

https://forum.paradoxplaza.com/forum/threads/lenticular-float32-mod-official-thread.1445090/post-27185216
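
For anyone unfamiliar with variance-based shadows, here is a minimal Python sketch of the visibility test used by standard variance shadow maps (Chebyshev's inequality on the first two depth moments). It illustrates the general technique only and is not the mod's shader code; the function name and the `min_variance` clamp are assumptions made for this example.

```python
def vsm_visibility(mean_depth, mean_depth_sq, receiver_depth, min_variance=1e-5):
    """Upper bound on how much light reaches a receiver (1.0 = fully lit).

    mean_depth and mean_depth_sq are the filtered depth and depth-squared
    values that a variance shadow map stores per texel.
    """
    # A receiver at or in front of the average occluder depth is fully lit.
    if receiver_depth <= mean_depth:
        return 1.0
    # Variance = E[d^2] - E[d]^2, clamped to avoid division by ~0 (assumed clamp).
    variance = max(mean_depth_sq - mean_depth * mean_depth, min_variance)
    distance = receiver_depth - mean_depth
    # Chebyshev's one-sided inequality gives the soft shadow falloff.
    return variance / (variance + distance * distance)
```

Because the two moments can be blurred like any other texture, this kind of test tends to give soft shadow edges, at the cost of some light bleeding that usually needs tuning.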
jjwalker  [author] 22 Dec 2020 at 8:03
Another day late...

I am trying to solve a floating-point rounding error, but if I am unable to do so by this afternoon, it is getting the update anyway. The error does not break anything and you won't notice it unless you are looking for it. I say error, but it really isn't one; it's just an oddity caused by floating-point rounding.
jjwalker  [author] 21 Dec 2020 at 8:05
A sudden migraine stopped me in my tracks yesterday...

For sure being updated today, see my comment below this one.
jjwalker  [author] 20 Dec 2020 at 13:53
Update on 1.2 Beta rev. F4
I believe (well, I am hoping) that it will be ready very late tonight; don't forget I am at UTC-6.
So far the changes are as follows:
- 80% optimized (as of right now as I am typing)
- FAR better GPU pipeline utilization - 50% less VRAM usage and eliminated multiple samples of the same texture for a single fragment
- Long warp stalls almost 0 (stalls happen, but fewer is better). Warps are 32-thread batches that execute simultaneously. The AMD equivalent is "wavefront".
- SM cores achieve up to 98% saturation - Derived from (Warps * SM Cores - unused cores).
The post below this details a few things better, but I'll get further into it when the update is released.
jjwalker  [author] 20 Dec 2020 at 9:19
So, 1.2 Beta rev. F4 has almost exclusively been optimization and frame profiling. Where I am at this very moment, the GPU pipeline is VERY efficient. Before (rev. F3 and prior) it was split into 3-4 chunks, each computing or otherwise stalled. I have got this down to essentially 2 chunks, which is really the minimum anyway: one "draw vertices" and one really large "shade pixels". That is simplified, but it translates to almost 50% less VRAM usage. Our "long warp stall" is 0% most of the time, down from 8-15% before. FPS and frame time are essentially the same, but we are also not missing any monitor refreshes (unless you have G-Sync or FreeSync, your monitor refreshes whether you are ready for it or not).
jjwalker  [author] 19 Dec 2020 at 7:58
OK, as it stands right now, 1.2 Beta F3 takes...
Frame Time to Draw -- CPU avg = 0.03ms - Sigma = 5.12ms -- GPU avg = 0.06ms - Sigma = 11.27ms --

Total of 16.39ms at 3X game speed to draw a single frame with SSAA and a very large map of 160,000 tiles. That's a map 7 times larger than 95% of people play on, not to mention that SSAA means our actual frame resolution is 3840x2160. I say not bad at all. We are going to make this faster though. Once you add latency and other stuff, that translates to a frame rate of 63.9fps.
What is important to note is the very large difference between the averages and our total draw time. That gap is because of the division, as everything has to wait for the divisions to complete. Frames aren't drawn in a linear fashion; as things complete, they wait in the frame buffer until everything else finishes.
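
As a rough cross-check of the numbers above, this small Python sketch adds the two Sigma figures (which give the 16.39 ms total) and converts a frame time to a frame rate with a plain reciprocal. The function name is made up for the example, and the sketch deliberately does not model the latency and other factors mentioned above, so it lands at about 61 fps rather than the quoted figure.

```python
def frame_rate_from_ms(frame_time_ms):
    """Convert a per-frame draw time in milliseconds to frames per second."""
    return 1000.0 / frame_time_ms


# Sigma values quoted in the comment above.
cpu_ms = 5.12
gpu_ms = 11.27
total_ms = cpu_ms + gpu_ms

print(f"{total_ms:.2f} ms per frame -> {frame_rate_from_ms(total_ms):.1f} fps")
```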
jjwalker  [author] 18 Dec 2020 at 8:24
It's amazing that in 2020 both CPUs and GPUs still suck at division. I use a few linear inverse proportions for global illumination (8 total) and 4 quadratic ones for attenuation. The quadratic equations are highly optimized at the hardware level and, even in their current form, the compiler does a good job with them. However, you wouldn't believe how hard a time a processing unit has with y=1/x (GPU worst case, 40 cycles; CPU, 100). If I just changed them to y=1/x^2 it'd be totally different, but I am really hell-bent on keeping these linear (without a curve). I guess I'll have to run it into assembly and look at it to see how much it's costing us, but do a quick Google search on "multiplicative inverse" or "floating point reciprocal" if you're curious. y=1/x is a costly beast!
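
To make the two falloff shapes concrete, here is a small Python sketch of y=1/x next to y=1/x^2, plus the common trick of paying for one division and reusing the reciprocal as a multiplication. The function names are invented for this example; it shows the math only and says nothing about the actual cycle counts on any particular GPU.

```python
def inverse_linear(x):
    """y = 1/x: the gentler falloff described above."""
    return 1.0 / x


def inverse_square(x):
    """y = 1/x^2: the steeper, quadratic falloff."""
    return 1.0 / (x * x)


def scale_by_reciprocal(values, x):
    """Divide many values by x using a single division."""
    inv_x = 1.0 / x  # the only division
    return [v * inv_x for v in values]  # multiplications from here on
```

At x = 4 the first curve still gives 0.25 while the second has already dropped to 0.0625, which is why swapping one for the other changes the look of the lighting.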
jjwalker  [author] 17 Dec 2020 at 4:18
*(continuation of the below comment)*
As I have said many times (in the code comments anyway) I cannot change PA.exe, so I am stuck with what I have to work with. Well, I am unable to escape some of the Fixed Functions PA.exe delivers and these functions have to remain. Going to OpenGL 3.3 means I can still use the fixed functions AND take advantage of the new features introduced. Why not OpenGL 4.0 or higher? Doing so would eliminate a large swathe of people using pre-DX11 hardware (example, you'd need a NVIDIA GTX400+ card).

So today, I'll be updating to OpenGL 3.3 and will provide further details on the changes when it is uploaded.
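
If you want to check whether your own machine clears the OpenGL 3.3 bar, a sketch like the following could read the `OpenGL Version : ...` line that the game writes to debug.txt. The function names and the regular expression are assumptions based on the log format posted in this thread, not anything shipped with the mod.

```python
import re

MIN_GL_VERSION = (3, 3)  # the target version discussed above


def parse_gl_version(line):
    """Return (major, minor) from an 'OpenGL Version : X.Y...' log line, or None."""
    match = re.search(r"OpenGL Version\s*:\s*(\d+)\.(\d+)", line)
    if match is None:
        return None
    return (int(match.group(1)), int(match.group(2)))


def meets_minimum(line, minimum=MIN_GL_VERSION):
    """True if the reported version is at least the required one."""
    version = parse_gl_version(line)
    return version is not None and version >= minimum
```

Keep in mind that a reported version only says what the driver claims to support; it does not guarantee that every shader will behave.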
jjwalker  [author] 17 Dec 2020 at 4:18
So I have changed direction a bit and, since I am character constrained, we'll keep this short.
The next update will place us into the realm of OpenGL 3.3 and I plan to stay there. When this was updated to OGL 3.0, one gigantic performance leap was gained because the shaders can now branch. When a conditional statement is used, if it is true we execute the block, and if it is false we skip over it, whereas before, conditionals still had to be executed even if we ignored the result. So branching saves a lot of resources and unneeded processing.
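
Here is a CPU-side analogy in Python for the difference described above: one version always evaluates the expensive work and throws the result away when the condition is false, the other skips it. The function names and the call counter are made up for illustration, and this is not shader code.

```python
def shade_without_branching(lit_flags, expensive):
    """Always run the expensive work, then select the result (old behavior)."""
    calls = 0
    results = []
    for lit in lit_flags:
        value = expensive()  # evaluated even when the result is discarded
        calls += 1
        results.append(value if lit else 0.0)
    return results, calls


def shade_with_branching(lit_flags, expensive):
    """Only run the expensive work when the condition is true."""
    calls = 0
    results = []
    for lit in lit_flags:
        if lit:
            value = expensive()
            calls += 1
        else:
            value = 0.0  # skipped entirely
        results.append(value)
    return results, calls
```

On a real GPU the saving depends on how coherent the branch is, since threads in the same warp that take different paths still have to wait for each other.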
jjwalker  [author] 16 Dec 2020 at 15:30
@murgh, lol!

It's going to be something, I'll find it...
murgh 16 Dec 2020 at 11:39
This mod: 'Hello World'.

My Intel 4400: 'No'.

Debug.txt: 'sup, nothing to see here'.
jjwalker  [author] 16 Dec 2020 at 11:22
Updated to 1.2b Rev F2 - See notes in Description.

I think I may write in some Gaussian Blur for the shadows (soft shadows essentially) and/or implement filtering (Bilinear or Bicubic) for the textures. The normal mapping introduces texture aliasing by nature so this will soften this up. I'll implement it and see what performance penalty there is, but it shouldn't in theory be more than the 2,000 to 4,000 GPU cycles I just freed up. I may also include an options file so that it can be turned on or off.
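
For reference, a Gaussian blur boils down to a weighted average with bell-shaped weights. The Python sketch below builds a normalized 1D kernel and applies it to one row of values; the function names, the radius/sigma parameters, and the clamp-at-the-edge behavior are all assumptions for this example, not the mod's implementation.

```python
import math


def gaussian_kernel(radius, sigma):
    """Normalized 1D Gaussian weights for offsets -radius..radius."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]  # weights sum to 1, so brightness is kept


def blur_row(row, kernel):
    """Blur one row of values, clamping sample positions at the edges."""
    radius = len(kernel) // 2
    last = len(row) - 1
    out = []
    for x in range(len(row)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            sample_x = min(max(x + k - radius, 0), last)
            acc += row[sample_x] * weight
        out.append(acc)
    return out
```

Running the same pass once horizontally and once vertically is the usual way to get a 2D blur at lower cost than a full 2D kernel, which is what turns a hard shadow edge into a soft one.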
jjwalker  [author] 16 Dec 2020 at 0:24
I caught a few more bugs tonight and optimized a few things. Update will be uploaded tomorrow.
jjwalker  [author] 15 Dec 2020 at 15:39
@murgh,
That's bizarre! There is nothing to see because it is compiling without error, but obviously there IS in fact a vertex shader issue (thus all of the misplaced textures and lights). It is related to the Intel driver, that I do know; it's just a question of how to find specifically what in the vertex shader is making it upset.
I sent you a friend request on here so we can communicate more easily.
Tonight I will go through the supported function list for the HD 4400 and see if I am using a function that Intel doesn't support.
In the meantime, go to 1.2 Beta's mod folder, open lightmap.fs in Notepad, change the very top line from #version 130 to #version 120, and save it, then load up the game.
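
If you would rather script that edit than do it by hand, a Python sketch along these lines could rewrite the top line. The function name is made up, and it only handles the case described above, where the `#version` directive is the very first line of the file.

```python
def set_glsl_version(source, new_version):
    """Return shader source with its leading '#version' line replaced."""
    lines = source.splitlines(keepends=True)
    if not lines or not lines[0].lstrip().startswith("#version"):
        raise ValueError("first line is not a #version directive")
    first = lines[0]
    # Keep whatever line ending the file already uses.
    ending = first[len(first.rstrip("\r\n")):]
    lines[0] = f"#version {new_version}{ending}"
    return "".join(lines)


# Hypothetical usage; adjust the path to wherever the mod folder lives:
# from pathlib import Path
# shader = Path("lightmap.fs")
# shader.write_text(set_glsl_version(shader.read_text(), 120))
```

Make a backup copy of lightmap.fs first so the change is easy to undo.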
murgh 15 Dec 2020 at 15:16
Like that? There seems to be nothing of interest in the rest of debug.
murgh 15 Dec 2020 at 15:15
DataRegistry DUMP:
WindowManager attempting to create window at 1600x900 windowed
OpenGL Vendor : Intel
OpenGL Renderer : Intel(R) HD Graphics 4400
OpenGL Version : 4.3.0 - Build 20.19.15.5166
OpenGL GLSL : 4.30 - Build 20.19.15.5166
Windows 8.1 Per-monitor DPI reported: 96 x 96
Parsing archive main.dat...
Parsing archive at path 'main.dat'
DONE
completed in 9052ms 9.05 Seconds
Parsing archive sounds.dat...
Parsing archive at path 'sounds.dat'
DONE
completed in 2197ms 2.20 Seconds
Using Native Win32 Condition Variables
There are 100 mod sub directories
Compiled with libpng libpng version 1.6.19 - November 12, 2015
(Running with version 1.6.29)
Warning: Loading a very high res image (4096x2048)
Warning: Loading a very high res image (4096x2048)
OpenGL using glGenerateMipmaps to generate mipmaps.
murgh 15 Dec 2020 at 15:15
Loading user sprite images for path: C:\Users\User\AppData\Local/Introversion/Prison Architect/mods/mods_Murgh_CombiPack_ChristmasRock/data/sprites.png
Warning: Loading a very high res image (8192x4096)
Loading user sprite images for path: data/sprites.png
Failed to get sprites.png for data/sprites.png
PackRectangles (2 spaces) packing 1 rectangles
Packing:
WorldRenderer::LoadUserSpriteImages Compositing image C:\Users\User\AppData\Local/Introversion/Prison Architect/mods/mods_Murgh_CombiPack_ChristmasRock/data/sprites.png (8192 x 4096) to 0, 0
Created FrameBuffer of size 64 x 64 in 35ms
Object spritebank composite took 2149ms
Warning: Loading a very high res image (2688x256)
Warning: mipmaps requested for non-power-of-two image (2688x256), will break on OpenGL ES
Warning: Loading a very high res image (4096x2048)
murgh 15 Dec 2020 at 15:15
WorldRenderer: vexCellTypes initialised in 0ms : 16000 triangles (vex,tex), 0.9 MBytes
ShaderOpenGL successfully compiled : LightMapNoFOWNoColour
ShaderOpenGL successfully compiled : LightMapNoFOW
WorldRenderer: vexCell2ndLayer initialised in 0ms : 16000 triangles (vex,tex), 0.9 MBytes
WorldRenderer: m_vexPaddedWalls initialised in 0ms : 8000 quads (vex,tex), 0.6 MBytes
ShaderOpenGL successfully compiled : LightMapNoFOWNoTexture
WorldRenderer: vexDetails initialised in 0ms : 16000 triangles (vex,tex,col), 1.1 MBytes
ShaderOpenGL successfully compiled : SunShadows
Waited for prerender group to finish for 0.000000 seconds
Warning: mipmaps requested for non-power-of-two image (1000x1000), will break on OpenGL ES
Warning: mipmaps requested for non-power-of-two image (359x436), will break on OpenGL ES
jjwalker  [author] 15 Dec 2020 at 14:57
@murgh,

Do me a favor: load 1.2 Beta back up and send me your debug.txt from the appdata directory. I have 1.2 set up to flag all compiler errors and warnings, and it will tell me what the Intel driver isn't able to deal with and exactly what line. That'll help me out TONS!

Looking at that picture, it is compiling but with vertex shader errors. That bright light in the top left hand corner is the sun and it will do that when the vertex shader gets broken.
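
Since debug.txt can be long, a small Python sketch like this could pull out the lines most likely to matter before posting. The function name is invented, and the keywords are simply taken from lines that show up in the log posted in this thread; the exact wording of a real shader compile error is not known here, so add keywords as needed.

```python
def interesting_lines(log_text, keywords=("ShaderOpenGL", "Warning", "Failed")):
    """Return the log lines that mention any of the given keywords."""
    return [line.strip()
            for line in log_text.splitlines()
            if any(keyword in line for keyword in keywords)]


# Hypothetical usage; the location of debug.txt is an assumption:
# with open("debug.txt", encoding="utf-8", errors="replace") as handle:
#     for line in interesting_lines(handle.read()):
#         print(line)
```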
murgh 15 Dec 2020 at 14:52
How can one be bad at Assembly or not like it? :steamhappy:
Can't remember if I tried yet another version... brain lag for now.
jjwalker  [author] 15 Dec 2020 at 14:48
@murgh,
I actually had you in mind earlier and was reading about Intel integrated graphics... Khronos, the developer of OpenGL, flat out said Intel drivers are garbage, lol.
Khronos also said that to fix compatibility issues, I should use ARB instead of GLSL... which basically means converting it to assembly language (um, no).

I'll see if I can find something that would port it over to ARB but I don't know how successful that would be nor if there would be any improvement (ARB is older than dirt). I'll do some research.

Did the 1.1 compatibility version I made work for you?
murgh 15 Dec 2020 at 14:37
murgh 15 Dec 2020 at 14:33
Still a no go for me:
https://gtm.steamproxy.vip/sharedfiles/filedetails/?id=2322231793
Keep up the good work!:lunar2019coolpig:
jjwalker  [author] 15 Dec 2020 at 8:55
I also forgot to mention, this isn't optimized yet. I am performing multi-cycle computation on some equations because I know they will work as written. For example, my specular map (how shiny a surface should be) is written as "float spec_map = (lmap.r + lmap.g + lmap.b) / 3.0;" which takes 3 cycles to complete, whereas "const vec3 B_Color = vec3(1.0); float spec_map = dot(lmap.rgb, B_Color) / 3.0;" only takes 2 cycles.
I may optimize like the example above before going up another GL version, but I might not to avoid possible compiler error... not sure yet.
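
The two forms above compute the same average, which this Python sketch checks numerically. It also shows one further step that is an assumption on my part rather than something from the mod: folding the division by 3 into the constant so that only the dot product remains. The function names are made up, and actual cycle counts depend on the GPU and the compiler.

```python
def dot(a, b):
    """Dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def average_by_sum(r, g, b):
    """(r + g + b) / 3, the straightforward form."""
    return (r + g + b) / 3.0


def average_by_dot(r, g, b):
    """dot(rgb, vec3(1.0)) / 3, the form quoted above."""
    return dot((r, g, b), (1.0, 1.0, 1.0)) / 3.0


def average_by_folded_dot(r, g, b):
    """dot(rgb, vec3(1/3)): the division is folded into the constant."""
    third = 1.0 / 3.0
    return dot((r, g, b), (third, third, third))
```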
jjwalker  [author] 15 Dec 2020 at 8:20
@[RIP] SpecialistZerO
I was play testing last night and while there is still a slight framerate drop, the drop was negligible when snowing. I couldn't find any bugs so really I could take the Beta tag off.
My shaders already run faster than the base game but the OpenGL upgrade is in some instances a 200% increase in frame rate. Also, land expansion doesn't cause a large dip in performance either anymore.
Before I yank the Beta tag, I'll let y'all test it on some different hardware, as Nvidia's compiler is rather forgiving (GTX 960 and 3900X here). Now that it is running in Core mode, the compiler will completely halt on any errors, so I don't foresee any issues.
That was how I was able to fix the bugs so fast. I THOUGHT Double 11/Paradox surely had a compiler pre-processing directive, but nope... the base game defaults to 13-year-old OpenGL 2.0, which is unbelievable. I know Introversion wrote it, so shame on them too, but Paradox can't tell me they bought the car without looking at it.
SpecialistZer0 15 Dec 2020 at 3:08
Thank you for this @jjwalker I'll give it a go later tonight.
jjwalker  [author] 14 Dec 2020 at 12:50
Now updated!
jjwalker  [author] 14 Dec 2020 at 12:39
Shadows have been completely fixed along with the bugs mentioned, will be updating this shortly.