```yaml
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: external-allow-to-restricted-urls
  namespace: test-ns
spec:
  # allow-list for the identities that can call the host
  selector:
    matchLabels:
      app.kubernetes.io/instance: test-adapter
      app.kubernetes.io/name: test-adapter
  action: ALLOW
  rules:
  - from:
    - source:
        principals: ["cluster.local/ns/test-ns/sa/test-adapter"]
    when:
    - key: connection.sni
      values:
      - "*.googleapis.com"
      - "*.amazonaws.com"
      - "*.datadoghq.eu"
```
1
https://imgs.xkcd.com/comics/standards.png
1
Mechanical keyboards are pretty cool if you get into them. Consumer-model “mechanical keyboards” on Amazon are usually OK, but they’re mostly selling a buzzword. The enthusiast boards are very customizable though: you can hot-swap switches and keycaps to your liking. Beyond manufactured enthusiast boards are PCB kits you can solder together with a microcontroller, which are incredibly customizable with QMK firmware. You can also get exotic boards like splits and macro pads.
For a gift though they aren’t great. The cheap consumer ones aren’t very feature-rich or customizable, the enthusiast ones are extremely expensive, and the DIY ones can be cheap and fun to build, but you have to source all the components, do all the soldering, and program the firmware, which takes time and effort.
1
"Also DevOps engineer is not a thing"
It is, actually; it's just a weird way to pronounce site reliability engineer.
1
"another one". Missed opportunity to say
Yet Another Modelling Language?
1
"Avoid it like the plague!" - someone who has to use bitbucket because our CEO has dinner with their CEO.
---
Using Bitbucket as a git repo is dreadful; we face tons of issues with them on a weekly basis. Their CI tool is good, and if your CIs are not too complex it's excellent.
Keep using GitLab if you can, that would be your best option. Moving from GitLab to Jenkins is moving in the opposite direction. Jenkins, although a great tool, is not keeping up with new-gen tools.
GitHub is great, I've never had an issue with it, either personal or enterprise. As for GitHub Actions, I've heard good things but haven't touched it yet.
1
"Definitely not what you think it is."
1
"FAANG-level scaling" seems to be a distinction without a difference from any other large company like Wal-Mart for example. What extra complexity are you referring to? Multi-region deploys? Multi-cloud?
1
"Fortunately Cloud Engineers have not been automated out of the job yet."
haha I guess that indeed is a good problem that I'm facing...Job security :)
But yeah I figured...Just wanted to give it a shot make sure I've done everything I can to resolve my issue. I guess I will have to look at this problem from a different angle.
Thank you for your reply!
1
"Good judgment comes from experience, and experience comes from bad judgment." lol
1
"I can use bash" doesn't inspire confidence in expertise. I'm not trying to shit on you, I just can't imagine being in the software industry for 25 years and not knowing at least one language inside and out.
When I say bespoke software, I don't necessarily mean giant complicated apps, just something to accomplish a task in a safe, repeatable, and automated way. That could be a CI/CD bot that kicks off deployments when conditions are met, a script to automate upgrading thousands of Terraform files from 0.11 to 0.12 syntax, or a chatbot to automate some internal support queries.
> If a software development team can't figure out a problem with their code and they think asking a DevOps Engineer will do the trick
It's a rare dev that can trace a problem from their pod being in CrashLoopBackOff to the actual issue. Maybe yours are of higher caliber.
1
"I found documentation on this process in confluence but it turned out to be old and outdated"
"OK, did you update it?"
"No, why would I do that?"
1
"Storing scripts in line is the path to madness." 😂
1
“Library we used…bug…open file descriptors” -> we introduced a hack.
Hard hmmmmm
How about doing the impossible -> switch library or write a patch. Mindblown, “nobody does that with opensource, we just want to use stuff for free”, eyes rolling.
1
“The only time you care Public or private is when you are doing strict strict security […]”
So, always?
1
„˙sɹǝƃɹnq dılɟ ɹǝɥʇɐɹ plnoM ˙ou pɹɐH ˙oᴎ„
1
[Citus](https://docs.citusdata.com/en/v11.0/use_cases/timeseries.html) will work if you have something to shard the data by, like tenant_id, application_id, etc. Then it's just a matter of throwing enough hardware at it.
Since last release they are 100% open source, previously they had some very nice features locked behind enterprise pay wall.
If you want a fully managed solution then it's available on Azure, or you can use Azure Arc to still have a managed solution on your own Kubernetes that's outside of Azure.
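Once the Citus extension is loaded, distributing a table by its shard key is a one-liner. A sketch, assuming a hypothetical `events` table sharded by the `tenant_id` mentioned above:

```sql
-- Hypothetical table; tenant_id is the shard key discussed above.
CREATE TABLE events (
    tenant_id  bigint NOT NULL,
    event_id   bigserial,
    payload    jsonb,
    created_at timestamptz DEFAULT now()
);

-- Citus: spread the table across worker nodes by tenant_id.
SELECT create_distributed_table('events', 'tenant_id');
```

After this, queries that filter on `tenant_id` are routed to a single shard, which is what makes the "throw hardware at it" scaling work.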
1
[deleted]
5
[Here’s a guide I made for it if you are into DIY.](https://github.com/DIYCharles/DIYKeyboards) Otherwise you can buy them prebuilt or use a PCB.
1
[removed]
144
[This FAQ](https://www.infracost.io/docs/faq/#how-does-infracost-work) describes how the tool works, how it parses cost-related params and uses those to lookup prices from the API. So the calculations are done on your machine :) As mentioned lower in that FAQ, you can also self-host the API.
1
[This video](https://www.youtube.com/watch?v=bB340S0tGf8) should clarify things for you.
* Github supports 2FA via U2F (see [here](https://github.blog/2021-05-10-security-keys-supported-ssh-git-operations/)), so using those is a no-brainer (for me).
* Backup you definitely want. How you do this is up to you. Copying to GitLab might or might not be a good idea.
* Basic security scanning is standard and should be part of a pre-commit hook
* Not sure what "find a solution for easy transfer" is supposed to be.
What you should avoid: any long term keys. If anyone ever leaves your company, all you should need to do is to remove their permissions. If there is anything else to do, you should re-think how you can remove those extra steps. In AWS IAM it's the roles which you need to remove and that should be it. That makes on-boarding equally simple: add them to a specific role and that should cover all their needs.
And from experience I can say: Use a single authentication/authorization framework if you can. The more you have, the more trouble you'll get.
1
*checks post history to figure out why you're like this*
*sees why you're like this*
Nevermind, all the best!
1
*glorrphuhuhghuhphmupumm*
1
*I hardly touch bash.*
*But I do use a tonne of*
*Yaml, JSON and python*
\- Redmilo666
---
^(I detect haikus. And sometimes, successfully.) ^[Learn more about me.](https://www.reddit.com/r/haikusbot/)
^(Opt out of replies: "haikusbot opt out" | Delete my comment: "haikusbot delete")
1
*lacks argument relevant to the topic of discussion*
*asserts argument ad hominem*
Well, carry on!
1
*shiver* How did that 24/7 work exactly? Did the whole team just not sleep, or did you work in shifts?
1
/r/homelab and /r/ansible
1
> picked Jenkins but my employer dislikes the tool
agree with the employer
try something like Azure DevOps or GitLab
1
~50 cores and 50K a year license fee.
1
> A lot of companies that were extremely successful as a startup failed in the growth phase, because they did not plan for (work towards) the end state.
I hear that a lot, but there were some stats here in Europe (can’t find the source anymore, sorry) showing that less than 6% of business deaths were actually due to technical issues. Do you have examples of growing businesses dying because of their lack of scalability?
I tend to believe more and more that this, too, is a cargo cult.
1
> Of course all these ideals are going to work when you have a fictional world set up to allow them to.
Wait this isn’t a review of Atlas Shrugged?
🥁
1
> Spring Boot will run Flyway migrations before the Spring Boot app starts.
That is not a very good idea.
I have written about alternatives here https://codefresh.io/blog/enterprise-ci-cd-best-practices-part-3/ (points 16,17,18)
1
> we have an environment per branch so splitting each of those in two would result in a lot of difficulty, I think.
That sounds like you've got configuration stored in git, which isn't ideal.
There is a process called "git deploy" which is pretty much what you describe, but I'd be looking to remove all configuration from the source repository, and "git deploy" to a separate, locked down repository that only contained the build artifact/s + configuration for each environment.
1
> “The real problem is that programmers have spent far too much time worrying about efficiency in the wrong places and at the wrong times; premature optimization is the root of all evil (or at least most of it) in programming."
The Art of Computer Programming by Donald Knuth
1
> Also when it comes to monitoring I quickly get confused. In terms of the stages of monitoring. Since I'm running on bare metal I have to monitor the hosts, then the VMs, and then the services in those VMs, and then the applications.
This is a fair point, but everything else pales in comparison to the question "Are my services up, or down?". Baby steps here. Get that one done, then start looking at the others.
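For that very first baby step, even a tiny stdlib probe answers "up, or down?". A minimal sketch (the URL is whatever each of your services exposes; a real setup would grow into something like Prometheus's blackbox exporter):

```python
import urllib.request

def is_up(url, timeout=3):
    """Return True if the service answers the HTTP request with a 2xx/3xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except OSError:  # connection refused, timeout, DNS failure, or HTTP error status
        return False
```

Run it from cron against each service and alert on `False`, and you have monitoring step one done.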
> This is currently hard as well, because a lot of things are not even in git.
The point about reproducible server builds - I guess at the moment teams are creating binaries and then pushing them to some artifact repo, OR you are building servers manually?
Either way - you want some automation that takes a base OS image and deploys your workloads on it. I recommended docker compose, because containers are super easy to test and debug, are super lightweight so you can start and stop them easily, it's a prereq for going to k8s, etc. etc. Other people have talked about ansible and vagrant, and those are fine as well. The point is that your config for building your servers out should live in git so you can make incremental changes, easily revert, see everything that's done, etc. etc. You don't need it to be auto-triggered on commit to the main branch, but it's nice to have, the point is that you need to be able to do the following:
1. Know exactly what things have been done to create a server
2. Test an identical server outside of prod before it deploys to prod
3. Kill a production server and redeploy it, and be confident it will be the same afterwards.
4. Make incremental changes to a server template, push a button and recreate it.
It won't be easy building this out, and you might have to move stuff that currently is running locally on servers to some centralised NFS server or some external drive managed by your hypervisor tech.
But this is necessary for the above reasons.
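As a concrete starting point, those four requirements can boil down to one small file in git. A hypothetical compose file (image names and ports are placeholders):

```yaml
# docker-compose.yml - lives in git; recreating the server is `docker compose up -d`
services:
  app:
    image: registry.example.com/myapp:1.4.2   # pinned tag, never :latest
    ports:
      - "8080:8080"
    env_file: .env.prod                       # per-environment config, outside the image
    restart: unless-stopped
  db:
    image: postgres:15
    volumes:
      - dbdata:/var/lib/postgresql/data       # state lives outside the container
volumes:
  dbdata:
```

Every change to the server is a commit, so points 1 through 4 fall out for free: the file *is* the record of what was done, and any host with Docker can replay it.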
1
> Another point, sometimes it’s easier to hire ex-FAANG engineers when you have the same platform.
This is it. FreeBSD jails are arguably better for some containerization workloads than Docker+Linux. But I can hire a guy who knows Docker in a day, and a FreeBSD guy might take me a year to find. As just one example.
Hit the nail on the head.
1
> apples to oranges
But you can still compare them.
1
> bill was in the marines..
He gains the ability to solve tech debt by shooting it.
Then grilling it, Texas style.
1
> But I am not sure about going back to the doctor and if it will help. I don't really like to talk about my issues.
This is of course up to you and far outside the scope of your original question but I'd encourage you to try and see a different doctor with a sleep diary in hand. Sometimes you see a doctor on their bad day and they're stressed or something and don't fully understand what you're trying to tell them. Doctors are humans too and can make mistakes.
Sleep is rough when it starts getting screwed up, but it's really key to your health. It's well worth trying to get it sorted even if it means a few uncomfortable conversations with your boss and a different doctor.
1
> But it finally hit me. In a true startup mentality, you may not have time to find and evaluate the best solution for your environment. You just need some solution that will “work”. So along those lines, if it works for FAANG, it must be good.
So this I think perfectly gets at the heart of the issue, with the most egregious manifestation of this phenomenon, but I’d say this is the fallacy, not the right conclusion.
This is one of the most dominating “why” pressures of why this happens, this kind of thinking, but it leads you down an alley of paradoxical inefficiency. FAANG solutions are what they are because of the scale they’re at; we always tend to use scale as an unthinking positive attribute of a system, but it’s actually a double edged sword. Scale of support goes two ways. 1, you have the supported load, and 2, you have the structure which does the supporting. 1 is the benefit, 2 is the cost, which acts mostly like fixed/onetime/overhead.
It is this variable vs fixed cost relationship of each solution which (strongly) contraindicates FAANG solutions for startup problems.
Examples:
- micro services: now you’re reconfiguring your build as a docker file, so that you can kubernetes for your one-button-deployability, kube is awful to keep configured as yaml, so now you’re learning helm, helm is pretty awful as is too, so you’re learning the ins & outs of dynamic helm, now you’re figuring out a container provider, and straightening out your local implementation wrt the apis they actually support, now you’re poking holes to get your pre-existing service discovery solution to work right during the transitionary period that is extended for an unknown period of time out into the future, etc etc
- ci/cd: you’ve got to get a ci file going, a ci provider (although now GHA has greatly diminished this cost), the scripts & checks, the organizational knowledge on how to interact with that system, how it folds into the manual review process, etc.
1
> but not all at the same time, so I had to deploy another 45 lambdas....
This would not be painful if it were serverless or SAM lambdas. I'd rather poop a brick than do this by hand though
1
> Drone OSS you can't use runners, you can only run jobs wherever the drone-server is running. However, you can run as many jobs as you want.
I'm confused by this. I have Drone running with two drone-runners running on my on-premise systems connected back to my drone-server.
1
> Find a solution for easy transfer and closing access to AWS/GCP keys, etc.
I'm assuming you are referring to the distribution of AWS keys. I use Hashicorp Vault to manage and distribute them dynamically. It has an "AWS Secrets Engine" that does this.
1
> found a way to resource bind them so an errant container doesn't starve all your processes of mem and CPU
This was my first thought, and that's a big reason why VMs are in play here for so long. Some of the code just uses all the CPU it has, or all the memory. And yet it still works and can be starved. They are being clamped and resources are allocated through the use of VMs. I suppose it's a separate topic to figure out how to possibly do resource allocation for a Docker container.
I am glad that I am understanding more already. It's helping me to know what path to take. Thank you!
1
> GitLab will be reduced to five users
I think that's just GitLab SaaS Free Tier, so rather than switching to Jenkins you could also just go Self-Managed/CE and be unaffected.
> But jenkins would at least be free
Depends on how much your time is worth.
Jenkins itself isn't hard to maintain, but rather than having functionality built into something that's auto-updated, the functionality lives in plugins. It also doesn't have nearly as tight an integration with version control, so you are flipping back and forth between systems.
1
> Google deciding to use LeetCode
When did that happen?
1
> Great now who is willing to teach a team of ops people who barely can do scripting that isn't willing to put in the time to learn?
Nobody, the era where you can be just an Ops guy is imo over.
1
> I also assume that this company is a consultancy type org, sending out their employees to multiple customer sites.
You'd be right about that. They do DevOps consulting for various companies. I didn't know this was a common thing, I don't know of many companies that bother to train their own staff these days. Most just complain about struggling to find competent employees while having ridiculous requirements for junior positions. This bootcamp and company seems like a good opportunity though. If you're familiar with this model, do you think it's a fair trade? Is there anything I should be aware of before committing?
As for learning, of course it's important to always keep learning. I feel like I am coming prepared though, as I am already familiar with programming and run my own Linux home server with various Docker containers.
1
> I guess at the moment teams are creating binaries and then pushing them to some artifact repo, OR you are building servers manually?
There's a mixture of different things. Some code is just git pulled, local config created, and it's running. Some are tar'd up, backed up, then replaced and we hope it works as well as it did in dev. Some are worked on directly in production, with no dev or staging environment; no git there either, just a backed-up production. I mean, it's a lot of work, and each service needs a proper identification of its issues and how I am going to lead them all in one direction.
I am thinking in my mind "code is code and deployment is deployment". My instincts say to make a repo like project1-deployment-dev and project1-deployment-prod, containing a docker-compose.yml or some Ansible files or Vagrant files to be able to deploy the environments and deploy from project1 repo. But this feels wrong, at least it doesn't feel very streamlined because then every project has different configs for different environments. I think the point of view with docker-compose I would imagine the best way to start that is to have the docker-compose.yml right with the deployable code itself. So the dev team can deploy a container, and then production can deploy the same container from the same repo or any commit they choose?
I am again thinking in terms of "services run in VMs, therefore the VM with its OS is the service". In this case a container would become the service, and that container does not even need a host configured in a certain way to be able to run the deployment. It could be any Linux distro, it just has to have Docker installed and obviously has to have basic provisioning and networking?
1
> I mean, it's a lot of work and each service needs a proper identification of it's issues and how I am going to lead them all in 1 direction.
The ideal scenario is that the script for building the container sits inside the repository, and when commits are made to main branch, the deployment artifact that is produced is a docker container. This is as nice for them as it is you.
For some teams you might have to meet them halfway by taking their tarball/binary and automatically baking it into a docker container in your own deployment repo.
> So the dev team can deploy a container, and then production can deploy the same container from the same repo or any commit they choose?
Bingo - and then you never get "it works on my machine" again - the machine is always the same, because the code moves with the dependencies. The only exception being the OS and the processor ;)
> I am again, thinking in terms of "services run in VMs, therefore the VM with its OS is the service". In this case a container would become the service and that container does not even need a host to be configured in a certain way to be able to run the deployment. It could be any linux distro, just have to have Docker installed and obviously has to have basic provisioning and networking?
Bingo - if you put things in containers you can move away from hypervisors entirely and just have your bare-metal hosts run docker containers. You might not want to do that right away unless you can be certain that you can run docker containers securely in this manner, and have found a way to resource bind them so an errant container doesn't starve all your processes of mem and CPU.
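Compose can express those resource bounds directly, so an errant container gets clamped much like a VM would. A sketch (the image name and limit values are made-up placeholders):

```yaml
services:
  worker:
    image: registry.example.com/worker:2.0   # placeholder image
    mem_limit: 512m      # hard cap: the container is OOM-killed past this
    cpus: 1.5            # at most one and a half CPU cores
    pids_limit: 256      # guards against fork bombs
```

The same knobs exist as `docker run --memory`, `--cpus`, and `--pids-limit` flags if you are not using compose.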
1
> I would always recommend immediately owning your mistakes, broadcast them to your team and grab as much help as you need to get it fixed. Be open and responsive to feedback and criticism, and be proactive in fixing documentation and processes where needed.
+1 to this. We all make mistakes. Owning your mistake through the remediation and resolution makes you the experienced engineer, rather than the trigger-happy intern (who's never heard from again).
1
> I'm happy I don't have to put public static void before each of my resource blocks in TF.
Wait for java 29 where they add more syntactic sugar and you will be able to write:
```
psv aws_instance(f var image):
build(image)
```
/s
1
> if you can afford it I would just go with a managed solution like Datadog that plumbs straight into Pagerduty
Don't have a budget for this I'm afraid. So I'm relying mostly on FOSS or free software, or in-house scripts to get the jobs done. Usually free self-hosted instances of things, such as Gitlab free tier, stuff like that. That's why I had my eyes on Prometheus and Grafana for metrics but not sure how to go about doing logging without running the full ELK.
Also when it comes to monitoring I quickly get confused about the stages of monitoring. Since I'm running on bare metal I have to monitor the hosts, then the VMs, then the services in those VMs, and then the applications. Right now, the applications are monitored on their own, but the services like nginx or mysql are not. A lot of the time something there will break, or something simple happens like running out of disk space because one VM was forgotten for a long time. These things, basic health checks, are what I do manually. I check how fast the disk space is shrinking by comparing to previous checks, forecast when it needs extra space, and then live-resize the VM's data drive. But this is the sort of thing I should stop being concerned about, and basic monitoring will solve it I am sure.
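That manual disk-space forecasting is exactly what Prometheus's `predict_linear` function automates once node_exporter is running on the hosts. A hedged alert-rule sketch (thresholds and group name are arbitrary):

```yaml
groups:
- name: disk
  rules:
  - alert: DiskFullIn24h
    # Extrapolate the last 6h of usage; fire if the filesystem would hit
    # zero free bytes within 24 hours.
    expr: predict_linear(node_filesystem_avail_bytes{fstype!="tmpfs"}[6h], 24 * 3600) < 0
    for: 1h
    labels:
      severity: warning
```

This replaces the "compare to last week's numbers by hand" routine with an alert that only fires when the trend actually points at trouble.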
> I'd focus next on CI - you want reproducible server builds
This is currently hard as well, because a lot of things are not even in git. Some are, but they require extra steps to get running. What I am trying to say is the deployment of software is not in git, so that deployment can't be deployed without actually sitting down and deploying the software and configuring it. For proper CI I need all the things in git and for deployment to be easy and straightforward right? I might be confusing CI with CD.
> Next I'd move to IaC
I see this helping in being able to bring up an entire dev environment with 1 command, or an entire production environment. And then not have to worry about manually configuring the entire thing, networking, security, etc. I do look forward to this one!
Thanks for your help!
1
> If you only know HCL and not a single programming language then that's a huge problem.
> No it's not.
Hard disagree. You absolutely need to know a language well, ideally something in common use like Python or Go. For starters, you will almost certainly need to write some bespoke software at some point. Also, you're going to get dragged into supporting devs, and if you can't understand their code, you're not going to be of any use.
Re: bash, it holds the internet together. You have to know bash, full stop.
1
> In TF providers can not use attributes of resources.
> A use case would be that you create a Kubernetes cluster and then deploy something onto it. With TF you have to split this into two steps, but with Pulumi you can deploy it in one step.
You can definitely do this in terraform. I know because this is what I am doing.
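The usual shape of this in Terraform is to feed the cluster's computed attributes straight into the `kubernetes` provider block. A sketch using EKS (resource names are hypothetical; other clouds follow the same pattern, and note HashiCorp does recommend separate states for larger setups):

```hcl
resource "aws_eks_cluster" "this" {
  name     = "demo"
  role_arn = aws_iam_role.cluster.arn

  vpc_config {
    subnet_ids = var.subnet_ids
  }
}

data "aws_eks_cluster_auth" "this" {
  name = aws_eks_cluster.this.name
}

# The provider consumes attributes that only exist after the cluster is
# created; Terraform orders this correctly within a single apply.
provider "kubernetes" {
  host                   = aws_eks_cluster.this.endpoint
  cluster_ca_certificate = base64decode(aws_eks_cluster.this.certificate_authority[0].data)
  token                  = data.aws_eks_cluster_auth.this.token
}

resource "kubernetes_namespace" "app" {
  metadata {
    name = "app"
  }
}
```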
1
> Including everything need to recreate all your pipelines and infrastructure (e.g. terraform).
That hadn't even occurred to me as a possibility. Thanks, will look into Terraform; looks seriously cool.
1
> Increasing cost (Cloud + ownership) with each change in pipeline
This does not follow. I've made plenty of pipeline changes without increasing cost.
> Drift in systems requiring constant changes on the pipeline (detecting also is difficult)
That's why there are systems for continuous reconciliation. It's one of the four fundamental principles of [GitOps](https://github.com/open-gitops/documents/blob/v0.1.0/PRINCIPLES.md).
> Versioning of IAAC + configuration code is difficult for self serve operations.
I presume you mean IaC? Anyway, learning how to version may be difficult, but so is learning how to script. In both cases you're frontloading some effort in order to significantly reduce effort later. Need to create a copy of your prod system for testing or DR purposes? Much easier with IaC. Need to commit changes and test them in staging before applying them to prod? Again, much easier with *versioned* IaC.
If you are finding major drawbacks with CI/CD typically you should ask yourself three questions:
* **Are you using the right tools?** For example, are you using Terraform or Pulumi for IaC instead of scripting the cloud APIs directly? Does Jenkins make sense when Gitlab also gives you CI/CD (and without Jenkins' fragile plugin framework)?
* **Are you using the tools correctly?** How are your CI/CD components interacting? Why is there a manual intervention between CI and CD? What are/aren't you using that's causing drift?
* **Are you using the tools to their full extent?** As mentioned before, Gitlab has built in declarative CI/CD capabilities and integrations that in my opinion exceed those of the other tools you have. Are you forcing a workflow onto your CI/CD pipelines that is orthogonal to efficient automation?
1
> Isn’t the tag the reference of the build? So you’re saying I essentially cannot do this:
```
$ docker pull docker.io/image_name:1.0
$ docker push registry.gitlab.com/image_name:1.0
```
No. For that to work you have to retag before you push into another registry. Like:
```
$ docker pull docker.io/image_name:1.0
$ docker image tag docker.io/image_name:1.0 registry.gitlab.com/image_name:1.0
$ docker push registry.gitlab.com/image_name:1.0
```
1
> it can be quite hectic, add way more stress in the daily work life instead of choosing a different path in the company...
The same can be true for system administrator positions. This depends on the kind of team and company you work for.
> Considering that i dont like developing.
Do you not like writing code at all, or do you not like it because you are not good at it (yet)? I think if I were you I would give it a shot and get some training, and if you don't like it you can always switch.
1
> It's just part of this myth that k8s is insanely hard to use.
Like most things, there's a kernel of truth here. Compare how many different moving parts (including third-party software like Etcd, Traefik, Istio, OPA Gatekeeper, Envoy, and other CNCF projects) are needed to run K8s at scale in production.
Then look at something like AWS Fargate, where you give Jeff Bezos a Docker image and never think about it again. I know the industry has settled on K8s, but I'd use ECS/Fargate every time when given the choice.
1
> maintaining documentation in a sensible structure
No such thing exists. What you need is quality *search*. Intranet search is somehow a harder problem than internet search. You might think fewer documents is a blessing, but it means you need 10x more structure and query understanding, while having only 1 percent of the link structure to work with.
1
> microservices
I pushed _hard_ for teams to build monoliths at my last job. Not _a_ monolith, because I knew that wouldn't fly, but like 6 monoliths instead of the 6276373839 poorly managed microservices we ended up with. Worse, our number 1 down time was caused by health checks that pinged other health checks round and round and round. Deployments became down time events because suddenly the API wasn't responding even though it was still in the k8s service.
And then folks didn't want to listen to me when I said there were better ways of ensuring you didn't make a bad call to a service. Because I had advocated for monoliths which meant "I didn't understand microservices" motherfucker do you think I was promoted to staff eng over you by not knowing???
> Terraform
Man, I love terraform, but for like 99.9999% of what we used it for, a bash script would've been better suited and would've required less explanation. The only thing it made better was creating IAM policy documents, and that's just because I hate JSON on the CLI (even then, a bash script reading in a JSON file 💥 fixed, but that's "too complicated" yet explaining DAGs isn't)
> secrets management
AAAAAAAAHHHHHHHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAH if I hear another dev say "we can _just_ standup vault" I'm gonna tell them it's their fucking problem and they can do it and manage it. Ever since I was exposed to that, I've seen _1_ use case where I went "yeah....okay" and that was generating per user encryption keys for extremely sensitive PII (ssn, bank creds, few others like that) but folks went _wild_ with it and started generating things like temporary postgres users in RDS and then looked at me weird when I showed them how much simpler (and far less error prone) rds iam auth was.
1
> most common HTTP requests/endpoints accessed in our app by the users
The oldest way to do this is by analyzing access logs. With metrics, you'd want to create a `counter` type metric for your endpoint handlers which increments a metric, including the endpoint name in either the namespace (e.g. `app.request.my_endpoint`) or in tags/metadata (e.g. `app.request` with tag `endpoint:my_endpoint`).
> most common HTTP requests/endpoints accessed in our app by the users
This should be a default metric on your load balancer, but you can also get it by creating a `gauge` metric which is incremented on new connections and decremented when they close.
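The counter/gauge mechanics can be sketched with nothing but the stdlib; a real setup would use a metrics client such as `prometheus_client`, and all names here are made up:

```python
from collections import Counter

# counter: monotonically increasing hit count, tagged per endpoint
requests = Counter()

def handle_request(endpoint):
    requests[f"app.request;endpoint:{endpoint}"] += 1

# gauge: a current value that goes both up and down
active_connections = 0

def on_connect():
    global active_connections
    active_connections += 1

def on_close():
    global active_connections
    active_connections -= 1

# simulate some traffic
for e in ["login", "login", "search"]:
    handle_request(e)
on_connect(); on_connect(); on_close()
```

The counter answers "most common endpoints" by sorting its values; the gauge answers "how many users right now" by reading its current value.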
1
> My end goal would be to use Kubernetes and understand that ecosystem. Do I start with that, just jump right in and fill in the gaps, solve problems as they come?
You could start with Kubernetes in a managed cloud environment. You would definitely need to know about Docker first though, to understand the concepts involved.
On bare metal though, you really should not run Kubernetes if you have not automated the whole process of spinning up and down new nodes. You would also need an automated process for creating new certificates from your intranet PKI for each Kubernetes host and possibly for its end users. The hosts need to be disposable, without any condition attached.
If I were in your shoes in this kind of environment, my first goal would be to aggregate logs, metrics and monitoring of services, create alerts, and figure out what needs immediate attention, like making sure I have backups of everything important.
After that I would host git repositories to have access to my versioned configurations, scripts, Dockerfiles etc., add a container registry for Docker images, and add a CI/CD service for automating some workflows: building images, and deployments to replace existing cron jobs or watch for changes.
Then I would create a golden base image and implement every new request with some configuration management on the new hosts, to either deploy some service directly or run it in Docker containers. In the downtime I would try to automate everything necessary and migrate old hosts and services.
I would only consider running Kubernetes at that point, if I decided that I want to run every service on that cluster, and when I can easily deploy new Kubernetes workers and have automated the whole process of building Docker images and getting updates into some downstream repository for anything the cluster depends on. Kubernetes is really overkill otherwise if you are just going to run 2 services.
> I feel like even though tools like Ansible are cool, what's the point in writing Playbooks to configure my VMs for my deployments if I'm going to make Docker images out of them anyway?
You need it because you want your hosts to be disposable, whether to use
docker or to run self-managed kubernetes. Your kubernetes hosts need to be
automatically configured and disposable; if they are not, it is going to be a
pain when you want to upgrade hosts or services, or move to another
distribution or release. It is a requirement if you want to run multiple
environments (dev, prod), otherwise you are just repeating work.
I still need configuration management for automating setup of hypervisors which
run my vms and configuring docker hosts, kubernetes hosts, network services
which are not run in docker containers or storage servers.
Configuration management is required because it is less error-prone, less
repetitive, more robust, more readable and self-documenting, and easier to
maintain and debug than some script you wrote for one specific use case, or
something you did manually and maybe forgot some steps of or forgot to
document.
Certain tasks are abstracted, so you can apply things you do repeatedly across
a set of hosts with some specific input variables (your environment, service
configurations, etc.).
> I should just write them as Dockerfiles and build the image? Or do I use Anisble to deploy the environment inside a Docker image, so the Dockerfile is provisioned using a Playbook?
No, you use ansible to configure the hosts where docker (or maybe, in the
future, kubernetes) runs. You would, for example, configure the hosts and
create services which start a docker container from an existing image that was
built from a Dockerfile.
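To make the split concrete, a minimal sketch of what that could look like as an Ansible playbook (the group, image, and port here are made-up examples, not anything from the thread):

```yaml
# Hypothetical playbook: Ansible prepares the host, Docker runs the app.
- hosts: docker_hosts
  become: true
  tasks:
    - name: Install Docker from the distro repositories
      ansible.builtin.package:
        name: docker.io
        state: present

    - name: Ensure the Docker daemon is running and enabled
      ansible.builtin.service:
        name: docker
        state: started
        enabled: true

    - name: Run the app container from a prebuilt image
      community.docker.docker_container:
        name: myapp
        image: registry.example.com/myapp:1.2.3
        restart_policy: unless-stopped
        published_ports:
          - "8080:8080"
```

The point being: Ansible never builds the image or configures anything inside it; it only gets the host into a state where a container built elsewhere can run.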
> If I'm going to use Kubernetes, why even bother with VMs?
Because if you are going to run kubernetes on bare metal, you are better off
running it in vms on a robust hypervisor, so that in case of maintenance or an
emergency you can easily dispose of the nodes, replace them, or take snapshots.
> Either way, I need images to throw them at Kubernetes and at the very least have some Playbooks to deploy Kubernetes and setup machines as nodes.
If you are currently not even scripting to automate host setup, I doubt
learning kubernetes right now should be your priority. You could definitely run
some services in docker to offload some workload, though, without knowing much
scripting.
You should still check out a configuration management tool to automate
processes you do repeatedly, like snapshotting on the hypervisor, deployments,
maintenance tasks, installing agents, or things you do in general on every
host every time. Or convert some documented process that takes a long time to
do manually into ansible.
1
> Once you have that done you can look at backing up apps configuration. Thats config files, cron jobs, custom files etc. Bonus point if it end up in a repo so you can track change the devs made on the server .
Not OP, but any recs on this? I have a bunch of embedded systems that run custom configuration files that change on the server. It would be great to back them up to a locally hosted repo and track changes. Any self-hosted solutions you can recommend? Thanks.
1
> Prometheus and Grafana
Yes, this is something I want to implement, I just haven't yet, because I haven't found a way to properly handle logs. Something like Logstash but without the entire ELK stack. I would like to see historical metrics in Grafana, pinpoint problem times, and then be able to look directly at logs without SSHing into any servers to start grepping for them. ELK seems like overkill though. Any suggestions for a FOSS/open source/free/self-hosted logging solution would be appreciated.
1
> Real DBAOps
Maybe a source on 'real DBA Ops' would be nice.
I have worked in environments where we have dedicated DBAs and migrations run prior to the build/deployment pipeline. This makes migrations during the deployment impossible, but the same goes for the app migrating the database on its own: the DBAs lose control and you might get in trouble. But in these environments the deployments are usually once every several weeks, if not less frequent.
For environments and organizations closer to 'devops' (what this subreddit is partially about), DBAs will not be continuously making schema changes as you deploy multiple new versions every week.
> what happens when your deployment fails because migration is incorrect
If the migration was incorrect, this will be caught in the integration environment. Some ORMs I have worked with have the option to check if the schema / model snapshot matches the database schema.
> You also lose on built in kubernetes rollback mechanisms.
This is what readiness probes are for. Fail the probe long enough and the application will get in a crash loop backoff. Deployment fails, old version is still running and you can roll back easily.
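For reference, the kind of Deployment fragment that mechanism relies on looks roughly like this (path, port, and thresholds are illustrative assumptions):

```yaml
# If /healthz keeps failing, the new pods never become Ready, the
# rollout stalls, and the old ReplicaSet keeps serving traffic.
spec:
  containers:
    - name: app
      image: registry.example.com/app:2.0.0
      readinessProbe:
        httpGet:
          path: /healthz
          port: 8080
        periodSeconds: 5
        failureThreshold: 3
```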
1
> rising costs
You were on the free plan and your company is complaining they have to pay for Enterprise DevOps platform.
Honestly GitLab Ultimate ($99/user/month) has a great ROI considering all the tools/functions it replaces (Jira, Jenkins, Bitbucket, Artifactory, Veracode, XL-CD, etc.).
1
> Should a person from non IT background learn devops?
Sure, give it a shot.
> How soon will he get the job?
Generally a minimum of 2-5 years of experience in a more entry-level position (depending on how driven you are to learn, and a bit of luck).
1
> So what about gitops tools like ArgoCD where git is the source of truth for deploying changes?
GitOps patterns update application *configuration*, not source code. AFAIK, there are no GitOps tools that write either code or build artifacts back to a repo - the repos in question should only contain items like k8s yamls.
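To illustrate the pattern: the config repo typically holds manifests like the sketch below, and CI's only write-back is a commit bumping the image tag, which the GitOps tool then syncs to the cluster (all names here are placeholders):

```yaml
# Illustrative manifest in the GitOps config repo (not the app repo).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 2
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: registry.example.com/myapp:1.4.2  # <- the line CI updates
```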
1
> team of ops people who barely can do scripting
Uhh... What kind of ops people can't write scripts? That's the main point of the job.
> Another advantage with DSL is its written cleanly.
You say that until you've seen heavily modularized Terraform. It can get really confusing really quickly.
1
> Terraform: Unless you’ve already got a guy who can write & maintain it, don’t go down the rabbit hole of declarative infra mgmt. See Spotify’s examples of “woops, we nuked our DNS when we were still learning how to do this process right” to see what would otherwise consume your time even if you had a devops-only team (which they do).
Hard disagree on this one, terraform is super simple to use if done early and will save a ton of headache if you grow bigger than a team of 3 people.
1
> Terraform. I prefer declarative
Amazing fake quote, that's not what they said at all
1
> Thank you! But in a practical way how i can work on them? For example i should tell my manager that i want to focus on Prometheus server administations , maybe also Ansible if it is on some projects etc?
Maybe take the opposite approach and ask them what role they expect you to fulfil in your team and what skillset they want you to develop. This should give you a better idea if the job is really a fit and give you a better idea what you should ask of them (more training, transfer to a different team or be given different responsibilities in your current team).
1
> That's orchestration for you. Terraform can't do that.
That's exactly the point - it solves a different use case, why is that so hard to understand?
1
> the search engine isn't the main issue. People like to blame Atlassian
My point is a bit more nuanced than "Atlassian needs to fix their search." Rather it's that we need to add signals search engines can use to provide a quality search. "document structure" typically means the most dedicated (ie most pedantic) moves documents into a bunch of folders that effectively hide things 30 layers deep.
What we need are more signals in the data ie the docs:
- more self links to assist in page ranking
- more metadata so my searches stop ranking meeting notes for a project more highly than architecture diagrams etc.
- a short pier for some bureaucratic antibodies to take a long walk off for demanding access controls to documentation and simultaneously outlawing RBAC
- tags, not folders. The purpose of a folder is basically encapsulation and information hiding, and there are so many edge cases where you, the author, have to make a judgement call about which of two locations makes sense for a document. With tags, the coordination problem between author and reader is reduced
- more data on search quality (first result clicks, bounce rate, etc) so that any effort to fix any of this can be validated
1
> The Unicorn Project": they implement the change and
> prove
> to a management who is often trying to get them fired that this is a superior way to operate.
The experience of my life paints a different picture. Even this company, which is a Fortune 100, changes top-down instead of bottom-up.
If you can do it, yay for thee.
1
> Unless you’ve already got a guy who can write & maintain it, don’t go down the rabbit hole of declarative infra mgmt
That's the wrong approach right there. There shouldn't be 'a guy' who can maintain it - everyone deploying infrastructure should be able to write terraform and if that's not the case - bring everyone up to a common toolset and skill level. This is devops, not "I'm a dev who runs an op's bash scripts"
1
> Vagrant would be another option
I used Vagrant as part of the KodeKloud courses a little bit. I setup a VM and did a little work with Ansible there to configure another VM. What I need is access to a large box to setup and test different node configurations. Work my way up, build Playbooks to be able to turn a bare metal box (or cloud) into a full deployment for testing purposes. ESX, Proxmox, whatever I can get access to.
> Also, I haven’t seen you mention git at all. If you aren’t using git, make sure you start.
I'm using git but likely not in the way I should be. I'm using git for some tools and scripts I wrote to better manage some things, but those tools and scripts require a config file on the local machine. And those config files are not backed up or under any version control. So that means while the scripts are in git the deployment details are not re-deployable. I would say that nothing is re-deployable using git right now, and that's what I think you mean?
1
> We didn't want Jira in the first place but apparently using Azure DevOps boards wasn't good enough because they couldn't get metrics/tshirt sizes and other garbage that make business people spreadsheets happy.
[Are you me](https://giphy.com/clips/collin-QMEkDP3yiIX5SDVG38)? One other - semi-legitimate - complaint that stakeholders had was that AzDO offers no service desk. I said, "that's the point!" but they didn't care :(
1
> What are some use cases that you see where Fargate would simply not be enough, and you need something like K8s?
Some cases where Fargate is not ideal are things like long-lived connections on edges, or systems which are scaled up enough to be cost-prohibitive on Fargate. For both of these, I'd use ECS-EC2 instead of K8s.
I think K8s is really for when you're Google, and you have millions of bare metal machines with different configurations. Or maybe when your team is providing compute resources to dozens of partner teams.
Someone with a more K8s-positive view may be able to elaborate more here.
1
> You can secure your secrets easily
I need to look at how to properly store secrets with Ansible, since this is something I am worried about.
1
> You need to stop slicing and dicing statements to suit your narrative.
This is the only sentence in your arrogant, strawman-filled drivel that actually applies here, but you really need to be telling it to yourself.
As for the rest, I appreciate you so aptly identifying yourself as someone I absolutely should not bother engaging with. All the best.
1
> your own work easier with automating repeatitive tasks,
WFH? Get a Roomba type vacuum and expense it if you can.
1
>Akeyless
Looking at them now, and they are compelling. It's a weird balance though:
- If you are very paranoid about security and want dynamic secrets, automatic rotation, and multi-cloud support, Akeyless supports all of this.
- However, they are SaaS only.
Sounds like a good fit for small to midsize companies in an early phase that don't want to get distracted hosting their own Vault and are willing to eat the risk of Akeyless getting popped. Seems interesting. I've reached out to get a demo sorted out.
1
>Also regarding Kubernetes, the "correct" way of deploying a backend and frontend would be:
>- Backend in the private subnet and ingress in the public subnet
>- Frontend in the public subnet alongside ingress?
In terms of Kubernetes clusters, it's typical to have your worker nodes in private subnets and your Load Balancers/Ingress in public subnets.
Also, at least for AWS it's common to use tags on private/public subnets so that your LB/Ingress controllers know where to create its resources to expose your services/pods outside of the k8s cluster.
* Public subnets - LB/Ingress that you intend to expose publicly
* Private subnets - Worker nodes + LB/Ingress that you intend to expose privately within your VPC
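For EKS specifically, the discovery tags the load balancer / ingress controllers look for are along these lines ("my-cluster" is a placeholder for your cluster name):

```yaml
# On public subnets (internet-facing load balancers):
kubernetes.io/role/elb: "1"
kubernetes.io/cluster/my-cluster: "shared"

# On private subnets (internal load balancers):
# kubernetes.io/role/internal-elb: "1"
# kubernetes.io/cluster/my-cluster: "shared"
```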
1
>and not as the step before getting a job as a devops engineer.
What if the company claims to hire most of the people who finish their bootcamp? That's the case in the company I'm going to, 90% of their employees were trained in the company's bootcamp and are doing real DevOps work.
1
>hardware to manage your max needs.
That definitely makes sense. If you have an existing Kubernetes cluster, I recommend doing a POC with "Jenkins X" (Not traditional regular Jenkins).
It will allow you the freedom to run multiple parallel runs without any issues if you do the capacity planning right.
1
>Basically you learn things as you go and focus on fast feedback loops between wanting to do something, looking it up and then doing it.
I guess this is where most of the Udemy courses don't excel - it kind of makes the learner bored. Do you think the same? :)
1
>Best way to learn kubernetes?
imho is not to start with rancher.
1. Start with the official kubernetes docs
2. minikube
3. Rent a vanilla kubernetes multinode cluster (because otherwise it's not practical to experience things like draining a node)
Knowing how to debug kubernetes imho is very useful, and if you skipped step one it would be good to go back to it (you may not have, but all you said was experimenting with rancher). Depends how managed you want to go.
That said, as an industry we shouldn't aspire to 'manage kubernetes' but deliver value (But at What Cost..) and use systems/tools/vendors which help make kubernetes boring. When kubernetes is boring you have won.
Other multi-node projects include k3s (by rancher), microk8s.
1
>bitbucket
Oh no, bitbucket. I don't know a single person who was/is happy with the Atlassian CI offerings.
1
>Broke IAM permissions to S3 bucket so AWS had to reset them to default
How tf does this happen?
1
>Decisions of the above-mentioned things can be the make or break of your business.
Counter-point: the only example you gave was Spotify, and they are a hugely successful business in spite of falling for the very trap you warn against?
In a start-up environment it hardly matters, as long as the tool you are cargo-culting isn't impacting your ability to get something out faster.
I've been part of the technical due diligence for companies that used some of the things you're advocating against, and guess what? Some of those specific technologies not only helped them grow their business (a microservice approach for a company that was a huge integrator), but their use of industry-standard tech (and yes, k8s and terraform are industry standard) directly helped with their company's exit and acquisition.
Your companies may have failed but I'm guessing they had nothing to do with standard-if-not-the-correct-for-our-use-case technologies.
1
>Deep inside accepting it "just fails" violates some axiom about how computers works. This is knowable and damn-it I shall know.
I feel this shit on a deep level.
1
>Doesn’t Solution v2 result in every invocation being a cold start?
Yes, however the 100s of milliseconds wasn't an issue in our use case.
>If so, is that actually better than a retry-able periodic failure at scale?
The `self-immolation` solution has the advantage of being robust to other memory/tmp leaks. We discussed using a random process to decide if setting an env var was required.
Your suggestion or an improved retry system would have also probably worked well. Understanding exactly what was retry-able wasn't as quick/easy for us in this case with images/pdfs.
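For anyone curious, the random-recycle idea is roughly this (a toy sketch; the function and env var names are made up, not our actual code):

```python
import os
import random

def maybe_flag_recycle(probability: float = 0.01) -> bool:
    """On a small random fraction of invocations, flag the execution
    environment for recycling, e.g. by flipping an env var that a
    wrapper or deploy step reacts to with a forced cold start."""
    if random.random() < probability:
        os.environ["FORCE_COLD_START"] = "1"
        return True
    return False
```

With `probability=1.0` every call flags a recycle; in practice you would tune the rate against how fast the leak grows.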
1
>Even if the IaC is written in C and you don't know C you can extrapolate and read the state and start writting IaC with the language you know.
I sure can. And now you've pigeon-holed your infrastructure into a language that just a tiny fraction of DevOps Engineers are familiar with. Good luck finding that unicorn engineer that you desperately need. If I or anyone in my circle was being poached by a company using Pulumi and C# to build infra, every one of us would nope out on that. It's not the standard in DevOps. The end.
>If you only know HCL and not a single programming language then that's a huge problem.
No it's not. Every DevOps Engineer understands Terraform, CloudFormation or both. This is what the industry expects today. Can I use Python, Bash, etc.. yes but I also can't create an ETL tool if I wanted to.
1
>ever found a good reason to use it over Python. Our Go lambdas don't run much faster than our Python ones, doesn't really seem worth the extra engineering effort.
YAML Ain't Markup Language
1
>Example would be rather than trying to coordinate the change of systems in a parallel manner with manual stops using CI runners, write a script that offloads that to e.g. a k8s cluster which is actually designed for that use case. Your CI system can simply block the job and poll if you want to wait on status completion, or your k8s tasks could include one that monitors its own jobs and reports back when it's done. The CI job could even just return a k8s invocation/job ID or similar, allowing the CI job to quickly exit, and then an operator could query back the status of the job ID as needed (if you wanted to minimize CI runner cost).
What you are describing here sounds really interesting. I have not been able to use k8s, but I want to get into it. It never occurred to me that you can basically also use it to run tasks from a deployment cycle on it. Pretty cool, do you have some resources for me where a system like this is described?
1
>Failing to recognize where you're at in the present of your company growth results in premature optimization, which is a tuning flaw in itself.
Can you expand upon this? Why is it a "tuning flaw"?
1
>get help.
[I hate get help.](https://www.youtube.com/watch?v=OGqqLo0QWoM)
1
>Here’s a guide I made for it if you are into DIY.
Thank you! Do you have any recommendations for pre-built ones?
1
>I am considering an ALB with host-based routing, but this seems like a monkeypatch. (Using a load balancer for what is primarily a routing task.) I would have to create a "target group" of 1 instance per each subdomain.
Why not ? This works great and I used it for years. ALB has great integration for EC2, ECS and ACM. You don't need to manage extra infrastructure. It can handle SSL termination for you.
1
>I currently work for a construction company that wants to transform their use of Azure and get the maximum out of it. They do not exactly have a dedicated DevOps team and are looking to change it.
That is not DevOps. That is a cloud engineering team or a platform engineering team
1
>I did not realize that being able to even just build an <insert thing here> was considered THAT hard.
The more I learn, it all just kind of starts to snowball. First it was a static site on S3 using pre-generated content. Then it was an API to return information about my AWS usage. Now I'm looking at building simple web apps. Every new <insert thing here> starts out as "ok, how do I even..." to "oh. That wasn't so hard."
1
>I don’t know what your problem is
Your "advice" is to dismiss things you don't like and arrive at conclusions that do not align with reality. You're actively doing harm to someone who listens to your nonsense. This is contemptible. Do better.
> I was completely fact-based in my opinion
I'm going to be very generous and assume you genuinely think you are. The only facts you've listed to back your opinions up - not your personal experiences, not your sweeping conclusions or baseless assumptions, but facts - are:
* faang is big and you likely aren't
* spotify had trouble this one time
Your personal experience is not a fact. It is what you experience.
1
>I know the industry has settled on K8s, but I'd use ECS/Fargate every time when given the choice.
Interesting take. We're just starting to move into Fargate. What are some use cases that you see where Fargate would simply not be enough, and you need something like K8s?
I also kind of get the sense that the industry has this love of K8s, and so everything must be done in K8s because it's K8s, rather than evaluating the use case and determining the true best solution.
1
>I sure can. And now you've pigeon-holed your infrastructure into a language that just a tiny fraction of DevOps Engineers are familiar with. Good luck finding that unicorn engineer that you desperately need. If I or anyone in my circle was being poached by a company using Pulumi and C# to build infra, everyone of us would nope out on that. It's not the standard in DevOps. The end.
How can you say the infra is pigeon-holed just because they are using a Turing-complete programming language? That does not really make sense. Why do you think HashiCorp is accelerating the development of their CDK?
We are already at a point where we are treating everything as code (infra, services, config, etc.), and whenever I get to choose a programming language over a DSL, I don't even blink. Any Turing-complete programming language will be better than a DSL 99% of the time; you can't even compare.
>No it's not. Every DevOps Engineer understands Terraform, CloudFormation or both. This is what the industry expects today. Can I use Python, Bash, etc.. yes but I also can't create an ETL tool if I wanted to
This is sad to read from someone who is a Principal DevOps Engineer, and it speaks volumes about why DevOps engineering is what it is now. No comment.
1
>I think I need a week to 2 weeks to migrate
Probably not true, and 2 weeks of work is worth a lot more than the 2k you spend per year.
I would steer away from a company that does not understand how paying for quality tools saves money in the long run. Some companies just don't understand the value of the time of their employees.
1
>I was asked how quickly I am wanting to get out of my current job(that I do not plan on leaving but dropping down hours significantly and branching out) and if I would like them to expedite the hiring process.
Not so much of a red flag.
>I asked if the previous employee abruptly quit on them and was told yes, along with "members of other teams have been handling it since they left" and the DevOps role is a one man band there.
Yikes. You'll have no support at all? What happens if you get hit by a bus? They just rinse/repeat this process while everything grinds to a halt? What if you get sick? Want to go on vacation? Have a death in the family? Red flag #1.
>I was asked about salary and when I gave my expectations, which I think are pretty low as I am trying to gain knowledge outside of a position that gives me no freedom to change things, I was told "well we have a specific budget and we have not comp'd this role in other states yet" they have two offices in CA and I live in CO.
**Bigger** yikes. Welcome to post-COVID job market. Your skill set should command a salary commensurate with what you can do, not your physical location. It doesn't matter if you're in CO, or SF, or NYC. You gave them your range. If they can't meet it, or try to lowball you, fuck them six ways to sunday. Know what you're worth, king. (Or queen). Red flag #'s 2, 3, and 4. If they're going to skimp on you, what else are they skimping on infrastructure wise?
1
>If you are in tf version hell, which can be a real thing in a big org, you should look for different tools to help with that.
Such as docker.
1
>If you haven't broken prod at least once in your career can you really be a devops/sysops/developer? :D
It's essentially a rite of passage at this point.
Good judgment comes from experience, and experience comes from bad judgment.
1
>If you want change in your company, you need to convince management and in most times upper management to force middle management...
You absolutely don't need to convince management, at least not in the "ask permission to make the change sense".
And they don't in the book, *particularly* in its follow-up "The Unicorn Project": they implement the change and *prove* to a management who is often trying to get them fired that this is a superior way to operate.
That is how you change things. It's a sad and non-innovative engineer who asks management to make changes instead of proving to management that these changes work. And before you dismiss this as unrealistic - I have done this, whilst having my job threatened, in multi-billion dollar companies with the worst engineering practices you can imagine.
It absolutely is 100% doable even when there's non-technical, volatile micromanagers and a "blame game" culture, exactly as depicted in the book. It's doable in far more conditions than are found in the book, as long as the engineers are prepared to put their jobs on the line disregarding the organisational hierarchy.
Neither book pretends your job isn't on the line when you do these things and that is accurate - your job always goes on the line when you implement that kind of change, but it absolutely does work and when there's an obviously better approach to technology management does shift.
1
>In other words, how to be sure that I won't forget to remove "AWS IAM roles which you need to remove", since people like you and me usually have a lot of other things on a plate?
Automate everything. If not possible, document it clearly and try to make it impossible to skip important steps. E.g. if you want to make sure people badge in, do not open the entrance door until they do. If you want to reliably disable accounts, provide a single method (script) which will do just that and it's 100% sure it worked when the script says "It's done".
As for long term security credentials: simply never give *any* out. Hashicorp's Vault or AWS KMS or similar do a good job of dynamic credentials management. OAuth2 or OpenID makes sure that people are who they claim (authentication). So use this to give them permissions (via roles).
Lastly give everyone the roles they need, but not more access than they require. Don't do shortcuts (e.g. admin access to prod DB).
You will still have the problem of disgruntled employees who simply copy the code or modify it in malicious ways, but this is not so much a technical issue. Treat everyone fairly, and you'll have fewer of those problems
1
>it also helps to visualize metrics quite well.
The second you start doing metrics, the developers are going to lose their minds.
1
>It will be difficult to say without knowing the exact shape of your data, rate of cardinality growth, and how many un-indexed fields you will require, but my org currently uses on-prem InfluxDB Enterprise. Our total cardinality is less than 50 million unique series, and our annual license cost is around $50k/yr however compute and storage is another \~$20k.
Our cardinality will increase 10% year on year. I have talked to them, and the estimates that I have received are in multiples of your bill.
1
>Its a poorly written fictional take on organizational change.
This is the part I can't get over.
Why are folks trying to mimic fictional organizational change?
1
>Knowing better how things are deployed and managed at runtime helps you better design things that are easier to deploy and manage if you decide you want to go back to programming again.
Very good point. Too many engineers who don't have the "big picture" of how everything works together.
1
>Language
That's why he said "Makeup language"
1
>more of just a "go to this link and play around with it
That can end up pretty costly at AWS :D
1
>No matter how y’all try to push Pulumi, cdk or any other IAC too promoting imperative languages, Terraform is still the default.
That doesn't make it the better solution.
>There’s power in simplicity and guardrails.
Its simplicity comes at the cost of operational toil. Terraform doesn't provide any guardrails out of the box. You can use (paid feature) Sentinel with cdktf as well without changing anything in your terraform workflow :)
(There are also alternatives you can use with Pulumi as well)
1
>Noted, thanks!
You're welcome!
1
>Pay the OOS creator to fix it for you ?
Fair point. It didn't occur to me (or the team) at the time. Also, I wonder if it would have solved the problem on the timeline we needed.
Probably a different thread to discuss the philosophy of how much of an OSS project's roadmap should be driven by commercial concerns vs community interest.
1
>Porcodio
Why you gotta be so mad? lol
Yaml is *technically* a programming language, just slightly different than what you're used to since it's specifically a data serialization language.
1
>Project Status (2021-12-11): No longer actively maintained
Uh oh...
1
>r than trying to coordinate the change of systems in a parallel manner with manual stops using CI runners, write a script that offloads that to e.g. a k8s cluster which is actually designed for that use case. Your CI system can simply block the job and poll if you want to wait on status completion, or your k8s tasks could include one that monitors its own jobs and reports back when it's done. The CI job could even just return a
lol.. JenkinsX is the answer you are looking for. It runs on K8s and allows you to reuse Gitlab build environment images with minimum to no changes. Just redirect it to the same repository where you moved your images from Gitlab.
1
>So you would rather install docker, mount your folder/fs into it
Yeah that's about all it takes, which is a lot easier, more consistent, and preferred by the development teams I support.
The Dockerfile is your version pinning and it also provides a way to bundle custom extensions, dependencies, providers, or what have you without requiring them to be pre-installed across your dev, CI, and CD environments -- all of which should be using the same thing to execute.
For example, when a dev executes Terraform for their sandbox, they have the option of executing through CI/CD or locally. With a dockerized Terraform container we can ensure each of those systems are executing the exact same way.
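The pinning image itself can be tiny; something along these lines (version, registry, and plugin path are illustrative assumptions, not the commenter's actual setup):

```dockerfile
# Everyone (dev laptops, CI, CD) executes Terraform through this image,
# so they all run the exact same binary and bundled dependencies.
FROM hashicorp/terraform:1.5.7

# Bundle internal providers/plugins so they don't have to be
# pre-installed in every environment (hypothetical path):
COPY plugins/ /root/.terraform.d/plugins/

WORKDIR /workspace
ENTRYPOINT ["terraform"]
```

Invocation is then just a mount of the working directory, e.g. `docker run --rm -v "$PWD:/workspace" my-terraform plan`.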
1
>t’s hard because I don’t like to b
Likewise, I'm glad to see I'm not the only one who avoids car finance - That being said, £39k for an M135i... I paid £33k for my M2 in 2020 (2017 model), M4 Competitions can be had even Used Approved for less than that!
1
>Terraform is an orchestration in itself.
How? :)
>Thanks to you ppl like me have a premium job.
You will end up unemployed one day if you keep clinging to the same tool :)
1
>terraform\_remote\_state
does not handle dependency resolution as part of a workflow. It's on you to do that.
1
>Thank you
I got you, bruh :)
1
>Thats exactly the point - it silves a different use-case, why is that so hard to understand ?
It's not a different use case. Everyone needs orchestration if they're using terraform, one way or another.
1
>This is going to be a very big change for the entire team since most of our developers do not respond very well to change
Trust me they do - a developer's entire life is change.
When people think developers aren't responding to change, the change is usually not working, and people who aren't engineers tend to model things "not working" as interpersonal or political problems, when they're almost always practical.
Developers almost universally adopt an agile approach when left on their own. Organisations who "move to agile" instead of simply *becoming* agile through good practice are almost always making a mockery of the term. Be cautious about falling into this trap - you can head off that problem by assuming that your devs have a reason to resist the approach that isn't "they don't like change".
My suggestion on tools is this - don't start with any, or if your team is remote, start with nothing more complex than Trello. In fact, consider not starting with scrum at all: remember, in agile *the team owns the process*. If you are going to tell them to work in scrum and they haven't agreed to that, you're already not doing agile, because you're telling them how to work instead of letting them determine how best to work.
This has the stink of "Corporate (fr)Agile" all over it.
1
>This is going to be a very big change for the entire team since most of our developers do not respond very well to change.
Get a lot of good, solid training so you’ll know what you’re doing. Some people use a “cookbook” approach for Waterfall that is somewhat mechanical. They typically use standard, well-defined phases with clearly-defined documentation deliverables to complete each phase. They might even use fill-in-the-blanks document templates.
There’s no “cookbook” approach for Agile Project Management and it can require a lot more skill and judgment to fit the approach to the nature of the project.
1
>What am I missing?
You're mostly just missing the notion of needing to bake in oauth2 or some such mechanism for auth. Maybe SSM since it's all on AWS? To that end, this is also an *excellent* opportunity to create IAM roles for least-priv access.
1
>What will you use to invoke the Lambda? You'd probably need to add extra bits like Kinesis/SQS to get the Lambda to invoke when CF+S3 receive a request
There is lambda@edge, if the application is simple enough :)
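For context, Lambda@Edge runs on CloudFront events like viewer-request, so simple logic needs no extra Kinesis/SQS plumbing. A hedged sketch of a viewer-request handler (the `/old/` path rewrite is a made-up example, but the event shape is CloudFront's documented format):

```python
def handler(event, context):
    """CloudFront viewer-request handler: redirect /old/* to /new/*,
    otherwise pass the request through to the origin unchanged."""
    request = event["Records"][0]["cf"]["request"]
    if request["uri"].startswith("/old/"):
        # Returning a response object short-circuits the request at the edge
        return {
            "status": "301",
            "statusDescription": "Moved Permanently",
            "headers": {
                "location": [
                    {"key": "Location",
                     "value": request["uri"].replace("/old/", "/new/", 1)}
                ]
            },
        }
    # Returning the request object lets CloudFront continue to the origin
    return request
```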
1
>When it comes to Docker, I understand it's similar to a VM just not a full OS is running, only a single process. I can understand that I need to turn my deployments into Docker images for any chance at a good devops workflow, but that is proving tough too. For example one service is a PHP webapp which has a couple of cronjobs setup. The app has a frontend, an API, and connects to redis and memcache and mysql. It's also a monolith. I'm not even sure where to start with making this high availability in Kubernetes for example. Because I'd need to deploy it 3 times, if I'm right. But if I just deployed it as it is the cron jobs would be running 3 times, and they are all designed to only run 1 at a time and no duplicates. I'm not sure what the right thing is to do, and not sure where to gain the knowledge except with trial and error.
Everything you described here can be moved into kubernetes pods.
The cron jobs running in only a single instance can be managed using StatefulSets and an init container. The init container marks the first pod spawned (identified via a script from the pod ID) as the "cron runner" and writes that into a file used by all 3 pods (this could be as simple as an emptyDir volume mount).
This is an example leveraging this method to set up master-slave redis:
[https://www.containiq.com/post/deploy-redis-cluster-on-kubernetes](https://www.containiq.com/post/deploy-redis-cluster-on-kubernetes)
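The ordinal check itself is trivial. A hedged sketch of the logic the init container (or the app itself) might run, assuming the usual StatefulSet naming convention of `<name>-<ordinal>`:

```python
import os
import re

def is_cron_leader(hostname: str) -> bool:
    """Return True only for ordinal 0 of a StatefulSet (e.g. 'myapp-0').

    StatefulSet pods get stable names ending in their ordinal, so pod 0
    can be elected as the single "cron runner" with no extra coordination.
    """
    match = re.search(r"-(\d+)$", hostname)
    return match is not None and int(match.group(1)) == 0

if __name__ == "__main__":
    # Inside a pod, the hostname is the pod name
    print(is_cron_leader(os.environ.get("HOSTNAME", "")))
```

The result can be written to a shared file so the PHP app only runs its cron entries on the leader.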
> feel like even though tools like Ansible are cool, what's the point in writing Playbooks to configure my VMs for my deployments if I'm going to make Docker images out of them anyway? I should just write them as Dockerfiles and build the image? Or do I use Anisble to deploy the environment inside a Docker image, so the Dockerfile is provisioned using a Playbook? I
It really depends on your scale. Any of these tools can tackle deployments. You could write simple bash/python/powershell scripts to do it. You could run an ansible playbook to do it. Heck, you can even use terraform with modules/script providers to do it.
For me, deploying new services is not Ansible's job in itself. It can be the tool used behind the curtain, but not the entry point. First you need a deployment management tool - something like Jenkins, Bamboo, Argo, etc. Build some well-defined steps and then use whatever script/tool you need to run your deployment. An almighty single Ansible playbook is not a good approach in my view. It's better to have multiple playbooks that each take care of a step, and have a build plan/deployment job/pipeline call the separate tasks/plays you need, complemented by some logic, scripts, etc.
For example, we are making bare-metal deployments (VM services/Docker) via Bamboo deployments that call separate scripts and Ansible plays.
Also avoid adding scripts/code/logic that can't be run manually inside what was described. Everything that a pipeline/deployment etc does automatically should be able to be run manually at any time by you. It's a common pitfall.
For Kubernetes we are using a git repo that has configurations for helmfile, which then downloads the repo, logs in to the cluster, and runs the helmfile deployment configured by the files in the downloaded repo (configs for which apps to deploy, how to log in, where images are found, helm values for each app per project, etc.).
1
>Why don't you parameterized the script and add it to all branches?
The only reason I didn't add the script to every branch is because we have a branch strategy (Dev --> Test --> Release) that I didn't want to bypass because I'm a junior developer and wanted the safe-feeling route. In retrospect I should have bypassed it regardless. Thankfully the script will \*eventually\* be in every branch, just not right now.
1
>Yea, though I don’t know who isn’t using k8s nowadays.
The majority of companies that develop software.
1
>Yes i hope i can find a work suited to this version of "DevOps" that you described
I can say this with certainty - it's most companies. It's far, far more common for a "DevOps" team to be working with the developers (hopefully very closely) than it is for them to be actually writing the application logic.
1
>You dont need terragrunt to accomplish the same, simple symbolic link to multiple backend.tf will do the same job.
If you have 5 modules that have different backend configurations, no symlink in the world can help you. The fact that it can't be templated is what will force you to maintain 5 different backend.tf files.
>With terragrunt you cannot push to multiple state files also.
You can template the backend.tf for each module, declare dependencies in your terragrunt.hcl, and then run a simple `terragrunt run-all plan` to have terragrunt plan everything in sequence. That's orchestration for you. Terraform can't do that.
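A hedged sketch of what that looks like (the module path, bucket name, and region are all made up; the `remote_state` block typically lives in a root terragrunt.hcl that child modules include):

```hcl
# live/app/terragrunt.hcl -- hypothetical module that depends on "vpc"
terraform {
  source = "../../modules//app"
}

# Declared dependency: run-all plans/applies vpc before app
dependency "vpc" {
  config_path = "../vpc"
}

inputs = {
  vpc_id = dependency.vpc.outputs.vpc_id
}

# Templated backend -- one definition instead of N hand-written backend.tf files
remote_state {
  backend = "s3"
  config = {
    bucket = "example-tf-state"
    key    = "${path_relative_to_include()}/terraform.tfstate"
    region = "eu-west-1"
  }
}
```

With that in place, `terragrunt run-all plan` from the parent directory walks the dependency graph in order.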
1
>Zexanima#0329
Would love to. I just sent you a friend request. You're going to be shocked how dumb I am.
1
#1 - traffic with allowed destinations will leave the service mesh from the egress gateway, which has a less restrictive instance security group
#2 - I remember there were some ports to unblock in the 15000 range on the Istio nodegroup (see the sidecar logs)
1
`Clarifying questions`
Apart from ArgoCD, what do you use for your CI/CD? Jenkins, Bamboo??
1
+1
1
+1 for ClickHouse
1
+1 for dev containers. Also work with other IDEs.
1
+1 for Victoria, it’s like Prometheus but more awesome.
1
+1 This is how I do it too, highly recommended for subjects to study and motivation. Nothing motivates you to keep learning like seeing that you're about 85% of the way there to a 50k raise.
1
~10 years ago: During a DR exercise I didn’t double check a setting that was known to be “sticky.” (3rd party app that would *sometimes* write the current config out during shutdown, overwriting the new one.)
Subsequent changes were made to DR and Prod. MQ configuration in Prod ended up pointed to DR. When things were processed in Prod after the DR exercise they were sent to the dead DR queue. Error messages were generated from the application about delivery failures, but lack of active monitoring resulted in a 48 hour delay before anyone noticed.
Employer missed federally mandated timelines (MQ was a trigger for physical document delivery) and paid a little under a hundred thousand in fines. It took manual processing with four people a week to straighten out, so tag on their salaries to the cost.
Surprisingly no real impact to me. The configuration issue was known and was supposed to have been fixed. The monitoring failure was a larger issue for them - turns out we weren’t *actively* monitoring a lot of things management assumed we were.
Vague because the incident made our quarterly report.
~23 years ago: Working in dev database. Get interrupted with a request for an “ad-hoc” report - the kind they want right then, while they’re on the phone. CEO, what do you do?
Pull up their data from Prod, give it to them, go back to working. Forgot to switch back to Dev, just highlighted my test SQL and reran it.
No WHERE clause on the SQL, still in Prod, removed every record from the table.
No problem, we have backups, right? Oh, no, we don’t - our SQL database had outgrown our backup capacity and upper management had declined the request for an upgraded DLT drive.
Ended up pcAnywhering into each of our offices (~40) and copying their local database (Symantec Q&A, go go DOS - yes, in 2000) over dialup (14.4k), dumping them to CSV then importing them into Microsoft SQL Server. Corporate items had to be manually entered.
Again saved by someone else’s mistake. A fuss had been made repeatedly over us not having a backup. Management acknowledged their role in the breakdown.
Meh.
1
👌🏽 What they said. Cool article
1
1. Even if I use VPC endpoints, how do I allow the SQS IPs in the security group? There are no fixed IPs. I understand the traffic will flow through the VPC, but the endpoint IP still needs to be in the security group.
2. I have tried SecurityGroupPolicy, but the Istio sidecar is creating some issues and the pods are not able to come up.
1
1. For the cost: we recently integrated a tool for vulnerability detection, which cost us; subsequently we needed to train our team on how to remediate the vulnerabilities it reported. The same goes for any tool we introduce.
For ownership cost: each time we make a change, say updating the instance type, the information is recorded in Jira, but the documentation begins to rot since it's not updated frequently.
2. For drift in the system: we have multiple setups for multiple customers, with minute changes according to the tools each customer uses (e.g. AD, DNS). We have variables in our automation to support these customisations, but if drift occurs in one system it is hard to reconcile without affecting other parts of the automation.
3. The biggest problem we have seen is that we integrate our code once, but deployment is always on demand (e.g. new deployments for multiple customer POCs). With Jenkins it's easy to deploy but harder to maintain the metadata. We can store it in a DB, but the metadata tends to rot (e.g. tracking the IP addresses of auto-scaled instances in one place - we have to go to the console to check).
What we have realised is that CI/CD as a whole works great where there is synchronisation between end systems. But one product with multiple flavours of deployment requires tweaking the CI/CD process. (It was our biggest mistake to go that route, but it's now extremely difficult to change without pissing off all the customers.)
1
1. The vanilla AWS CNI does not support network policies. You need to deploy a 3rd-party plugin like Calico or Cilium.
2. It's a bit tricky to put FQDNs in network policies. I need to restrict access to the internet but allow access to particular FQDNs, e.g. AWS services like S3 and SQS.
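For reference, Cilium (unlike vanilla Kubernetes NetworkPolicy) does support FQDN-based egress rules. A hedged sketch, with made-up names and namespaces:

```yaml
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-aws-fqdns
  namespace: my-app
spec:
  endpointSelector:
    matchLabels:
      app: my-app
  egress:
    # Allow DNS so Cilium can observe FQDN resolutions
    - toEndpoints:
        - matchLabels:
            k8s:io.kubernetes.pod.namespace: kube-system
            k8s-app: kube-dns
      toPorts:
        - ports:
            - port: "53"
              protocol: ANY
          rules:
            dns:
              - matchPattern: "*"
    # Allow only S3 and SQS; everything else to the internet is denied
    - toFQDNs:
        - matchPattern: "*.s3.amazonaws.com"
        - matchPattern: "sqs.*.amazonaws.com"
```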
1
1. Vulnerability scanning needs to be done with or without CI/CD. It is a known cost, either as a fixed monthly fee or as a metered fee. It's an operational expense, and not something that changes with each pipeline update. Same goes for other tools you introduce - you budget for them, and then you use them. Furthermore integrating those tools into CI/CD reduces the time spent running the scans manually. Unless you either a) have a terrible CI/CD setup or b) don't run the scans properly when doing it manually, then you'll find that CI/CD actually saves you money by making the vulnerability scanning an automatic part of the workflow.
2. The first part is configuration management, not CI/CD. If you're putting the configuration management logic into CI/CD then you are not using it correctly - it can be **triggered** by CI/CD, but should be **delegated** to the configuration management tools (Ansible or similar).
3. You are trying to use CI/CD as your source of truth. That is not going to work, again it's not the function of the CI/CD system. Jenkins should read the metadata, not maintain it.
> What we have realised is CICD as a whole works great where there is synchronisation between end systems. But with systems with one product multiple flavors of deployment requires tweak with CICD process
Again, you are mistaking CI/CD for configuration management. The configuration logic should be stored outside of your CI/CD system. Multitenant systems in particular benefit from the GitOps approach, too.
Consider the following setup:
* CI deals with **building** artefacts - vuln scanning, linting, compiling, integration and unit testing, and packaging.
* CD deals with **publishing** artefacts - once a build is approved and merged, CD makes it **available**. It does *not* push the artefacts to systems.
* Infrastructure as Code provisions and updates the infrastructure, up to the point of installing and bootstrapping the configuration management agent. The bootstrap aspect includes ensuring that it has access to artefacts, secrets, and knows what configuration it needs to apply. You can eventually run IaC via CI/CD pipelines, but it's not a requirement. There's standalone IaC management systems too.
* The agent then performs continuous reconciliation (CR). It periodically pulls the latest version of its assigned declarative configuration and ensures that its current state matches its desired state.
Now if you want to update a customer system you just update the desired state for said system. More than one system can follow the same desired state, which makes setting up test environments easier. Each customer has a separate desired configuration directory, so you minimize the chances of conflicting changes.
1
100%
1
100% agree. I don’t regret reading the Phoenix Project. I found it entertaining, but it tries to paint its ideals as the right thing to do to have a productive software department. Of course all these ideals are going to work when you have a fictional world set up to allow them to.
Also 100% agree with Accelerate, too. So much of measuring productivity in software engineering feels like a soft science but this book did an excellent job putting data behind it. The results weren’t anything glamorous but provide a useful tool to convince management and non technical people that achieving the goals in the book are important to the health of the company. A main goal, IIRC, was reducing cycle time. I use that as a metric all the time for trying to improve our software delivery process.
1
15 years ago. You truly did learn your lesson.
1
2k a year? You work in a shitty company if that low of a cost is a concern
1
2k per year is really nothing. To put it into perspective, using fairly normal industry rates, $2k would buy one engineer's time for like a week. If it takes more than one week (which it will) to migrate everything over, including setting up pipelines in a new tool that sound at least somewhat complex in logic, you're already in the hole for that year on the swap.
This is _so_ not worth doing it's not even funny. But good luck!
1
2k/yr doesn’t sound too bad, considering it holds your company's assets. Also keep in mind that Jenkins needs on-prem resources to run; at my job we probably have 50-100k€, if not a lot more, in hardware behind this instance. So while the software is free, the CPU required will still be your expense.
1
2nd on using AWS Secrets Manager. It’s very simple if you already have IAM figured out. Consul is another option.
1
A big difference is how providers are handled.
In TF, providers can not use attributes of resources. In Pulumi, on the other hand, providers and normal resources are handled identically in the dependency tree.
A use case would be creating a Kubernetes cluster and then deploying something onto it.
With TF you have to split this into two steps; with Pulumi you can do it in one.
Eg. I want to set up an IdP service using Keycloak.
Kubernetes --> Keycloak --> Setup IdP on Keycloak.
3 steps with Terraform
1 with Pulumi
1
A build environment is something else and that's the only place where containerizing these tools makes sense. And even then you'd not want all tools clamped together in a single image, you'd want to separate them so you can have a better lifecycle control.
If you are in tf version hell, which can be a real thing in a big org, you should look for different tools to help with that.
1
A configuration change I rolled out to our Linux machines had an error in it that wasn't caught until machines updated packages on a patch Tuesday, causing hard outages across the entire F10 enterprise, including the corporate web presence. It turns out the Ansible module I used had a bug that let through a configuration file that shouldn't have validated. That Tuesday happened to land on the day of a shareholder meeting. When I woke up to a page, with dozens of people paging at the highest severity possible all saying that DNS was down, I took a look at the errors in our logs, realized the mistake, and rolled the fix out to the fleet within about 5 minutes - roughly 15 minutes before the meeting started.
Due to the potential audience of the meeting I could have been almost directly responsible for share price drops in the middle of the trading day and singlehandedly dropping the Dow Jones a noticeable point or two.
1
A cookbook
1
A couple of ways
- All our Kubernetes config is done with CUE - so I can define our own version of e.g. a CronJob and it will be repeated throughout the org with annotations & spec consistently defined.
- Our tool & service config uses CUE (compiled down to YAML in CD) so we can define, say, an IP address for a service in a definitions file in the root folder and any tool that uses that service can import the config file and use it without having to pass in addresses with flags or environment variables.
- 'profiles' for debugging multiple environments. I have 'dev', 'prod', 'staging' CUE files that I can reference as I need to, to change my local environment settings all in one go.
All the above are handled from centralised definitions in the root directory that are 'pushed out' to subfolders. Rather than a subfolder 'inheriting' config from parent directories. It's a system that works really well for config.
Recently we had an issue with floating IP addresses in DigitalOcean being disconnected from the VM, so I made a CronJob in Go that checks every 10 minutes whether the IP address is correctly attached and reconnects it if it is disconnected. I defined the ID of the IP address asset in a CUE file in the root folder (floatingips.cue), then in a subfolder I defined the configuration for that CronJob which uses the floating IP ID (devops/tool/floatingip-cronjob.cue) by importing the 'floatingip' package. I then say, in that service definition, that this service is a CronJob which automatically generates a Kubernetes manifest with the floatingIP information as a ConfigMap for that tool.
Then, when it comes time to deploy, the CD system loads the prod CUE file, compiles it all down to YAML and the Kubernetes manifest is applied to the cluster, along with all the config as a ConfigMap. Much less boilerplate (I have one service definition file that loads in all the config and generates Kubernetes manifests), and easier to debug locally.
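The shape of that setup in CUE looks roughly like the following (all names here are hypothetical, not from the setup described above):

```cue
package config

// Centralised definition in the root folder (e.g. floatingips.cue)
#FloatingIP: {
	id:     string
	region: string | *"lon1" // default value
}

floatingIPs: web: #FloatingIP & {id: "abc-123"}

// A tool in a subfolder references the central definition directly;
// CD compiles this down to a Kubernetes manifest plus ConfigMap
cronjob: {
	name:       "floatingip-reattach"
	schedule:   "*/10 * * * *"
	floatingIP: floatingIPs.web.id
}
```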
1
A Discord bot usually runs in an event loop. So if, for whatever reason, you need to launch it during the pipeline (why exactly would you do that?), you'd need to detect the environment (runner/gh-actions) and add a check for that environment to exit cleanly.
1
A driveless NAS would be slightly odd, but that would be offset by the coolness factor IMO. Just gotta be clear with people that they need to get the drives themselves
Some NAS can still be useful as a very small server even without the drives. But I'm no expert there.
1
A few counter points, or reasons why I think it is not bad.
I think candidate perception wise it is a bit divisive (strong love/hate for the practice!) and changing in the mid term (likely a shift in the landscape where it is not leaning so much towards the candidate), but I can see that companies want to put up an air of being FAANG-ish and mimic their success - dress for the job they want, in a way. I have had candidates surprised by my interviews because I don't have them live-code something, focusing instead on soft skills and approaches.
Another advantage of copying the big boys is that they put a lot of money in finding the right candidate for them. The type of person who would jump in to Leetcode with gusto may have the primary traits that a FAANG is looking for in their engineers. If a start up or other company wants to attract similar talent it makes sense to copy the practices.
I don't really disagree with the points you made. I hate the FAANG hiring practices, though I think they make sense given how big their candidate pool is, and thinking that as a smaller company practices at their scale will also make sense for them is a fallacy.
1
A general best practice here is just to go set up some billing alerts (see AWS Budgets) so that you're not taken by surprise. Takes 5 minutes — could save you a LOT of headache.
1
A good dsl is easy to grok and learn. That is why we use them. Too bad tf very quickly careens off into the weeds of complexity...
1
A good rule of thumb imo is that you should optimize for 5-10x your current size. Not for what you have now, and not for facebook scale.
Then, you bake in a good 5-10 years of growth before you have to rearchitect everything, while at the same time you're not making the system so needlessly complex that you need to touch 14 things just to make a small change to an API.
1
A guided DevOps meditation session for his soul:
https://www.youtube.com/watch?v=epcbx5HkCbM
1
A job title
1
A little bit of Jenkins. A whole lot of Vela.
1
A lot of linux work requires bash, windows shops use powershell.
Then you need some language to glue pieces together or write more advanced scripts. Python is the industry standard, but if there is close collaboration with the dev team other languages may be used - preferably Node, since most people have some contact with JavaScript. Ruby is okay, but C# or Java shouldn't be used for such things unless devs are willing to support them 100%.
Golang - If you are thinking of writing a simple tool in Golang, then don't. If you want to write a Lambda in Golang because it's "fast" and "efficient", then think about how many times per second it will actually run, and why you are writing it and not the devs. The only good use case for Golang is when you need to make a PR to some open source tool. Once again, do not use Golang if you can use Python instead.
Whatever language the open source tools are written in. But same case as GoLang, you ain't gonna need it. Seriously, I have years of experience as a programmer and I make just a handful or PRs a year and most of the time just reporting an issue and waiting a bit would be enough.
TCL - expect is sometimes useful, I'm sorry but I just like it. But if something is to be used for a longer time and by others it's not a good tool.
Whatever the devs are using on the project - If you have experience as a dev and know your way around software development, it's good to understand the basics of what the devs are writing. You are not me, you are not that other devops you know, and you don't have to be a better programmer than the senior dev on your team. But sometimes it's really useful to be able to understand what the code is doing.
1
A lot of the recent devops tooling has been based on the "cattle not pets" assumption. It sounds like you have a lot of "pets", which is totally fine. I would maybe look at building a preprod/dev environment where you can test rolling changes out to your servers. Maybe you can design an automated test that cleans itself up in a spare account? Or focus on automating the patching process? Ansible is a good choice for stuff like that, but I'd be wary about just running it in prod straight away :-D
1
A lot of tools have shoehorned aspects of programming into YAML.
1
A lot, if not most of that is being handled more and more by cloud providers. Eventually all we're going to need are people to glue some services together with a little bit of code - that's what software engineers do currently.
1
A macropad. It makes commonly used commands or strings easy. I use it all the time for work; it saves me a lot of time.
Here are some of the functions I run
-git commands
-open programs
-types my password and hits enter
-mute my microphone
-extend display to monitors
-mouse jiggler
1
A single instance target group attached to an ALB might be overkill a bit but it would work for what you’re asking. It costs a few cents an hour just to exist but so would a properly resilient HAProxy cluster. ALBs can do some basic-to-intermediate rewrite rules in their listener config. Wouldn’t call it a “monkey patch” but might just be your path of least resistance.
1
a Stream Deck from El Gato
1
A.blowjob
1
Absolutely true, but the caveat here is that, if a company is doing things "right", they'll have implemented a scalable solution from the get-go that scales on more than the technical facet. For example, yeah maybe that billion-Lambda service is technically advanced and responsive to workload, but how about the cognitive overhead that it's generating for the development team(s)? Or the cost (in money and time) necessary to maintain it?
There's a fairly fine line between solving a problem with an eye to the future and deciding how you are going to solve the problem before you even know what the problem looks like.
1
Absolutely!
1
Accidentally deleted systemd on a Prod Debian system.
1
Acloudguru getting worse?
Does anyone feel that acloudguru has sort of dropped in quality? First they sort of destroyed Linux Academy; now content is just being pushed out at an abysmal pace.
It feels like they are eventually going to merge fully w/ Pluralsight. They were acquired just for their hands-on lab tech (which came from Linux Academy).
1
Add authentication to your Tornado app: write a custom auth handler, plus something that writes a session token to an external cache (e.g. Redis) and creates a secure cookie. Your custom auth handler in Tornado should be able to read this cookie, check the cache, and confirm the user is authenticated. Don't create a login path in Tornado, so that there's no way to bypass the Flask endpoint. Anyone without the cookie or a token in the cache will be 403'd.
Something like that the thing you're looking for?
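A minimal sketch of the cache-check half of that idea, with a plain dict standing in for Redis and a made-up key scheme (in Tornado you'd fetch the token with `self.get_secure_cookie` inside a `RequestHandler`):

```python
from typing import Mapping, Optional

# Hypothetical key scheme, shared with the Flask login endpoint
SESSION_PREFIX = "session:"

def authenticated_user(token: Optional[bytes],
                       cache: Mapping[str, str]) -> Optional[str]:
    """Return the username for a valid session token, else None (-> 403).

    `cache` stands in for Redis here; the Flask login endpoint is assumed
    to write `session:<token> -> username` after a successful login, so
    Tornado never needs its own login path.
    """
    if not token:
        return None
    return cache.get(SESSION_PREFIX + token.decode())

# Usage: inside a Tornado RequestHandler's get_current_user you'd call
#   authenticated_user(self.get_secure_cookie("session"), redis_client)
# and raise tornado.web.HTTPError(403) when it returns None.
```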
1
Add some Cognito for federated auth against your service of preference and you are pretty much set. If you want to ramp up the complexity start playing with multi-AZ and multi-region replicas in order to have a highly available and resilient deployment, add some security hardening with IAM policies and do it all using Terraform pipelines.
1
AFAIK k8s is almost always maintained by a dedicated DevOps team, not all of the engineers. At that point it's well beyond startup territory.
I shouldn’t have said “a guy”. I meant bandwidth. There should be someone taking the lead, though, and whether they have the bandwidth for that depends on the whole team’s.
1
AFAIK the storage/query component of VictoriaMetrics is not specific to observability, you should be able to use it for any kind of timeseries.
Some components like vmalert will not be relevant to you, but you can just not use them.
1
after using cdktf it’s really really obvious to me all these damn DSLs (cool yes, but) need to go away. just give me a cdk in a generic programming language, and get on with life. no one has to learn hcl and bang their head against the wall any time you have to do more than just a loop
1
Afterwards, I'll definitely change the employer.
1
Agile is another management principle for all the decisions IT professionals take to produce a product. In my experience, end users do not have a clue what they want, how technology can help them, how to make things better, or how to achieve it. They constantly move the goalposts (mid-project) and ignore integration and future needs. There is always a mismatch between what is possible and what is required.
1
Agoda is always looking for people. Almost everything is in English and depending on the role they'll organise everything all the visas etc.
Ping me for details, always good to make a referral :)
1
Agreed 100%. Our stack is very git-ops heavy with cdk8s transpiling and a custom k8s operator, so we make it clear that coding knowledge is necessary. We do a coding challenge, nothing leetcode-y, but more just something functional that you would theoretically have to do in your day-to-day work. But we also make it clear from the beginning that these skills are required.
1
Agreed and that is where the monitoring and continuous tweaking helps.
As you said, you implement a scalable solution that enables you to stabilize current operations while providing the flexibility to make changes as needs arise.
1
Agreed u/zuxqoj we (at Timescale) don't currently offer support for on-prem and (*as community manager*) I really wish we did.
If cloud-based managed services is an option – considering there's no license fee involved, the software itself being free – feel free to DM me and we can get you in touch with someone who might be able to look at this with you to see if it's viable.
1
Agreed with these points. From looking at the docs, it seems like the easiest solution when your use cases are simpler. Thanks!
1
Ah I see. You have to docker login then tag and push then docker login to the other registry and then tag and push
1
Ah man, forgetting terraform destroy... been there; luckily it was just over a weekend 😀
1
Ah yes, You would prefer the m140i over the m135i I assume.
1
Ah, I see.
Well, if you primarily do webdev and don't need to build GUI applications, you could use something like [VSCode Server](https://code.visualstudio.com/blogs/2022/07/07/vscode-server) in the container and then access your text editor/IDE through a web browser.
1
Ah, so what you are suggesting is to use Ansible to write the Playbook to setup the monitoring stack in any host I choose, and give Ansible my secrets for where it should configure to dump the data at, and then run it on the fleet? And then all the hosts will suddenly have monitoring setup and working just like that?
1
Ah, thank you.
1
Ah. I’m a data specialist who is now “devops”. I left when data science became a buzzword and because I wanted to focus more on how everything is built.
It was a good route because any log data that is crap I know how to fix and anything the data team needs I know what they’re talking about and why they need it.
If you go devops, it’ll just be the opposite for you if you go back into data science. You’ll know how to create clusters and how to host your data etc.
I think if you study the underlying principles (aka the equivalent of a good cs degree) you should be able to bounce around really easily.
I’ve worked in so many different fields and stacks. Let your passions guide your career.
There was a time where all I wanted to do was play with data but then I realized I was having more fun building the pipelines that got the data etc etc.
Don’t do a language or a stack “cuz it pays well”. You’ll be wayyyyyy happier at the end of the day.
1
Ahh, I see. Perhaps you can use runners but only on the same machine.
https://www.drone.io/enterprise/opensource/
I'm thinking of the context of OP's question, where they want to use it on multiple machines. Can't do that with OSS, even if you can use runners.
1
ahhhhhh…. the famous “production” branch is born
1
Akeyless is a great one to look at
1
All data was replicated elsewhere, but that doesn’t help when the search powered by that ES cluster is *core business*.
1
It all depends on requirements. But you should know how it compares against your existing systems, and you should be ready with proposals.
1
All that for an internship ?
1
All things considered at 100k salary and usually an employee costing 2x in benefits and ancillary costs….that’s 20h of labor equivalent. Per year. Your company thinks that 1 FTE even 1 week per year is going to provide that kind of business value?
1
Almost certainly not plausible for the pay to be high enough to make the headaches, health hit, and loss of control / authority worth it if you’re in a fairly decent place already. I’m talking at least $1MM / year USD compensation or roughly 4 figures / hour consulting rates which is oftentimes what an entire team bills at, not a single IC. Also with no on-call stipulated in the contract. Maybe if you’re under toptal but it’s not really paid hourly as much as per project usually.
1
Almost every operation is simpler and involves fewer lines of code in TypeScript than in HCL
https://gist.github.com/rawkode/809b400094e43e3d9f8cd619f2507027
1
Alright, I legit did not know about this. Main reason being that we run migrations separately from app deployments. Thanks for the information.
1
Alright, so just to continue this line of thought. Let's say host group A is nginx. I would be able to tell Ansible to install those packages. But I would also be able to tell it how to configure them without touching any config files? I'm getting a little stuck with that, because it feels the same: different hosts might all have nginx, but different hosts need different nginx configs. How does Ansible know? Or am I overthinking it? I am just so used to manually doing everything I am finding it hard to not cling to old practices.
I'm going to start setup of some test environments so I'll probably get my questions answered soon, but thank you for the comments and help!
1
Also alcohol
1
Also, just remembered there was a new hire at my first company. The lady had just started working and decided to clean up her hard drive. Only it wasn't her drive, it was a network share, with all the code for the app that was supposed to go to the customer next week.
This was before git, and the backups were all on tape. And the tapes were defective.
Everyone in the company was freaking out until they found a copy of the code on someone's drive.
1
Also of concern is the app having permissions to modify database tables. If that same app is connected to the internet, then in a roundabout way, the internet eventually has those same permissions…
1
Also ordered wrong hardware for edge device (router). Spent a week trying to make certain IPSec tunnels work. Consultants hired. They are also stymied. Notunnelsup.png
Get second edge device. Three times the cost. Tunnels work at first config. Waste a week of work, consultant costs, and could have cut 1/4 of total capital expenditures.
Lesson learned: Buy enterprise grade hardware for enterprise level work.
1
Also, how did you manage to transition your knowledge from data field and profit from it in devops field
1
Also, you can think of a git repo as a place to control your automation. Running automation is done with pull requests. These pull requests double as a paper trail.
Also associate each branch to a specific environment. So if branch A, deploy code to environment A, using variables A.
If branch B, deploy this same code to environment B, using variables B.
Some environments will differ, so your code will need to differ, but largely you can use the same chunks of code between your environments, and at the minimum it can be done in a repeatable way so you can incrementally improve it.
This also opens up the testing side of things. Then you'll get to work on the release management, which is the bigger picture.
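As a sketch of that branch-to-environment mapping, in something like GitLab CI it could look as follows (branch, environment, and variable names are all made up):

```yaml
deploy:
  script: ./deploy.sh   # hypothetical deploy script, parameterized by TARGET_ENV
  rules:
    - if: '$CI_COMMIT_BRANCH == "branch-a"'
      variables:
        TARGET_ENV: environment-a   # variables A
    - if: '$CI_COMMIT_BRANCH == "branch-b"'
      variables:
        TARGET_ENV: environment-b   # variables B
  environment:
    name: $TARGET_ENV
```

The same job deploys the same code everywhere; only the variables differ per branch, which keeps it repeatable.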
1
Alternatively, use flask as a reverse proxy using requests, and lock down your tornado app completely. Auth up flask and only call tornado from authenticated endpoints in flask so the tornado app is invisible to the outside world?
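A minimal sketch of that idea, assuming the Tornado app listens only on localhost (the port, route prefix, and session check are all made up for illustration):

```python
from flask import Flask, Response, abort, request, session
import requests

TORNADO = "http://127.0.0.1:8888"  # assumed internal-only Tornado address

app = Flask(__name__)
app.secret_key = "change-me"  # placeholder; load from config in practice

@app.route("/app/<path:path>", methods=["GET", "POST"])
def proxy(path):
    # Only authenticated sessions ever reach Tornado.
    if "user" not in session:
        abort(401)
    upstream = requests.request(
        request.method,
        f"{TORNADO}/{path}",
        params=request.args,
        data=request.get_data(),
        headers={k: v for k, v in request.headers if k.lower() != "host"},
        timeout=10,
    )
    # Relay Tornado's response body and status back to the client.
    return Response(upstream.content, upstream.status_code)
```

Since Tornado binds only to 127.0.0.1, the only way in from outside is through Flask's authenticated route.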
1
Although I agree, Ctrl + Shift + Escape for task manager!
1
Always.
1
Amen
1
Amplify is really just a framework that brings lots of those services together. It’ll add in some auth and security stuff like Cognito too. Definitely not just clicking through a gui.
1
An alternative would be [https://sqitch.org/](https://sqitch.org/)
1
And bought your whole FUT with Company CC?
1
And how your comment helps the OP hmm??
1
And this is something you don't find in AWS docs – something they hide so as not to show how shitty their service can be sometimes.
Like removing policies from an AWS Batch environment, making it impossible to destroy...
1
And valid questions like this are why we still have security teams, assessments, and gating sometimes... Thank heavens...
There is just so much to know or understand.
Great questions OP and good you asked.
Another example, say you are working on a project and setting up backend storage... Let's say it's NFS or iSCSI and ip addressable storage. Let's imagine you give that internet (public) addressable space and it can actually route up and out of your network....
You should of course have perimeter firewalls that'll take care of that right? You ever see a perfectly configured perimeter security device yet? Because I haven't... I am hesitant to even claim there could be such a thing...
So, industry best practice would be to never give your storage, or anything else that does not NEED or want to be externally accessible, ip space for it.
Find a tinfoil hat, wear it proudly. It'll do wonders for you professionally ;)
1
And vpc flow logs for the eni of the nlb at least, depending on cost considerations
1
Anecdotally, cloud providers are ok about a first time screwup, as in they'll reverse charges up to a point. It's usually small potatoes to them and it's in their interest to have developers love them.
You still gotta be a little bit responsible and avoid doing big things until you gain knowledge and confidence.
1
Another one. Before the days of Prometheus and stuff like New Relic/Datadog, we had our own logging and alerting code. My code was complete and ready to deploy, but back then we had to go through CAB and stuff to touch prod. Prod deploys were delegated to a different team.
Code was deployed at 3am Saturday morning. Traffic was low, no problem. Come Monday morning, said alerting went nuts, sending a shit ton of emails every 10s. The deployment guy didn't follow the deployment steps and entered the thresholds and backoffs missing a few 0's. The worst part was we couldn't change it till CAB approved. We watched our Exchange servers start failing globally.
After this, we were given emergency changes that needed a single manager sign-off. They changed Exchange to not fail over out of region as well (I'm not an Exchange guy, so pardon me if I got this part wrong). I got in shit for allowing the code to be set to such low thresholds, but I wrote it to requirements, so that saved me.
1
Any problems with it being a single point of failure?
1
Any Rogers employees from Canada in this thread? Come on. Fess up.
1
API gateway to call the lambdas, which gets called by the statically-served UI
1
Applied anything? No, I found it to be anecdotal and mildly entertaining, but I've also lived every scenario discussed in the story. I still recommend it and The Unicorn Project to anyone interested in a career in IT.
1
Appropriate: https://xkcd.com/927/
1
Arch? Aw!
Alright.
And how important is learning Linux for DevOps? I see, in some places, just a basic knowledge of Linux is enough. Is that true?
One more question: how do I practice the DevOps concepts I learned? If it's FE or BE development, I might have to do a project (usually contributing to open source projects), but how do I do that with DevOps?
1
Are we talking.... Aws meta tags.... Kubernetes labels.... Labels in gcp...
1
Are you comfortable with the command line? It's less a question of what you use and more a question of what you're capable of using.
1
Are you kidding me? HCL is 1000% a lower barrier. But it certainly isn't as flexible as Python for example. You have to keep in mind though, HCL is so popular today because of two reasons alone:
1. The barrier to entry is super low. (It's easy to learn)
2. HCL solves 95%+ of all use-cases (which is a SWAG)
Things like CDK and Pulumi are really just trying to solve for those more complex edge cases, which is why they will never reach the level of adoption that Terraform/Terragrunt have.
1
Are you running the bot on the runner? Why? Your bot needs a computer to live on; instead, your runner is running the bot. You should be using GitHub Actions to deploy the bot to a computer, issue the startup command with an ampersand, and exit, leaving the bot running on the target computer.
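A rough sketch of a workflow job along those lines (host, user, paths, and the start command are all hypothetical):

```yaml
deploy:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - name: Deploy bot to its long-lived host
      run: |
        # Copy the code over, then restart the bot detached so it
        # keeps running after the runner exits.
        scp -r . deploy@bot-host.example.com:/opt/bot
        ssh deploy@bot-host.example.com \
          'pkill -f bot.py || true; cd /opt/bot && nohup python3 bot.py > bot.log 2>&1 &'
```

In practice you'd keep the SSH key in repo secrets, and a service manager like systemd is a sturdier choice than nohup.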
1
Are you saying Grafana Mimir is the best or the worst? This thing always throws me off, not sure if it is meant to be a "greater than" symbol or an arrow pointing to a progression to better.
1
Are you talking about ECS EC2 ? I used ECS Fargate for years and it's managed by AWS and you just deploy....
1
Are you talking about kind of how a [Gatsby](https://www.gatsbyjs.com/docs/graphql/) application pulls in all of its data and then serves it as a local GraphQL server? Only against live data/message queue, not necessarily just building static pages at build time.
1
Are you using any cloud provider? Or is it on prem?
In your situation I wouldn't bother with kubernetes unless you're on prem. I would go with something more basic like AWS ECS on fargate (not sure what equivalents other providers have). Not as powerful as kubernetes, but for most cases it will be fine and you'll have a lot less to learn.
If you do ECS then down the line you can still move to kubernetes. But you can do it once you've got a better idea of how everything's working together.
Focus on building docker containers based on the 12-factor app principles. If you follow those you'll get yourself into a really good position to build upon.
1
Are you using RKE or k3s for your cluster? RKE(2) is quite a bit more resource-hungry than k3s; it wouldn't surprise me if it needed at least 4GB of RAM. k3s should be fine with less though.
As for network drivers...do you mean the Kubernetes network plugin or network drivers on your hosts? For Kubernetes I wouldn't worry about it and just go with the default (I think that's calico for rke and k3s). It's not really important if you just want to try it out.
1
Argo Workflows for pipelines and ArgoCD for deployment.
1
Argocd on openshift
1
Arguably I'm parroting stuff I've learned studying for the CCSP exam. If the link works, there are two paragraphs on the topic at the bottom of this page: https://books.google.com/books?id=vCLFDgAAQBAJ&lpg=PA76&ots=QbLt04_4QH&dq=why%20you%20shouldn't%20manage%20encryption%20keys%20in%20the%20cloud&pg=PA77#v=onepage&q=why%20you%20shouldn't%20manage%20encryption%20keys%20in%20the%20cloud&f=false
Google has a bit about it. But they also want you to store it in their product.
[https://cloud.google.com/blog/products/identity-security/3-scenarios-where-keeping-encryption-keys-off-the-cloud-may-be-necessary](https://cloud.google.com/blog/products/identity-security/3-scenarios-where-keeping-encryption-keys-off-the-cloud-may-be-necessary)
I'm pretty sure the CSA recommends it: [https://cloudsecurityalliance.org/research/working-groups/cloud-key-management/](https://cloudsecurityalliance.org/research/working-groups/cloud-key-management/)
Vendor lock in could be a concern.
1
As a DevOps engineer I can confirm that this would be an awesome gift. However they’re really hard to find right now :(
1
As a DevOps Engineer, I don't get to decide (nor should I) what language the application/service is written in. That's up to the software development teams. My only concern is creating IaC standard patterns that my whole team can support and that future engineers can take and improve upon.
1
As always, you would learn about the categories and how to win 😀
1
As I said in my other comment, you'd be surprised. You can likely run VirtualBox on your local laptop (unless it is just a total lightweight wuss) and do fine.
1
As I understand it, websockets first use HTTP, then upgrade the connection to a websocket connection (using the HTTP Upgrade header).
Thus, depending on where you want to restrict access, you could prevent websocket connections from being started in the first place with a cookie/session authentication check at the server, denying the websocket connection unless there's a valid session.
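As a hedged sketch (the route and session key are made up), rejecting the HTTP Upgrade handshake itself when there's no valid session could look like:

```python
from flask import Flask, abort, request, session

app = Flask(__name__)
app.secret_key = "change-me"  # placeholder secret

@app.before_request
def gate_websocket_upgrade():
    # The WebSocket handshake arrives as a plain HTTP request with an
    # "Upgrade: websocket" header; refuse it unless the session is valid.
    if request.headers.get("Upgrade", "").lower() == "websocket":
        if "user" not in session:
            abort(403)

@app.route("/ws")
def ws():
    return "handshake would be upgraded here"
```

Because the check runs before the upgrade completes, the websocket connection is never established for unauthenticated clients.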
1
As long as the employer is not crap, any experience is worth it. Learn what you can from everyone and everything. The only things to keep in mind:
You can go and find other jobs if things go south or you're no longer learning. I often get so entrenched in a job I forget I have options.
Don't let the company you work for be your only frame of reference. So many people I've seen know what their company does with a technology, and have no clue how it could be used outside of that company.
1
As much as I love Europe, US if money matters. I prefer having healthcare as part of my taxes, the public transportation in Europe, etc. but the salary for tech here is almost double European salaries.
1
As other folks have said it really depends on the role and the infrastructure you work with. Some places run everything in k8s, but there are plenty of shops using something else entirely (like ECS, Nomad, or just regular ol servers).
I would say generally that understanding how containers work will help you a lot in most roles.
1
As someone recently comparing ECS and EKS, EKS is still more complex than ECS (Google's hosted Kubernetes otoh is fantastic). But beyond that, everything Amazon builds integrates with ECS, versus everything third-party integrating with Kubernetes, so depending on whether you're expecting to use more Amazon stuff or general industry stuff that can tilt the balance.
1
As someone with a master's degree in this stuff, yes. Many businesses use cargo cult management techniques. And those are the not so evil ones. The ones run by narcissists or sociopaths are just big trauma machines destroying the lives of employees, customers, contractors and pretty much every one else they come in contact with. But our society doesn't give a flip about mental health, happiness or any of those things. To solve the problem we have to fix society. I wouldn't say I'm as progressive as some, but I will say that conservatism is evil. Straight up.
1
As things are now, I’d say yes and no to all of it being handled at the cloud provider level. I’m a big proponent of things like ECS Fargate in terms of setting up infrastructure that requires minimal patching, but I’ve also walked into situations where the dev team, or worse yet an external agency, had provisioned all their own infrastructure, and their dev and staging environments were entirely open to the internet, and their security groups were a mess. I also found read write keys checked into source. Large Shared VMs like EC2 aren’t going away any time soon and in that sense a lot of the old Ops/Security side concerns still remain.
1
As u/HeavyFreshToad said, SPF defines the permitted sources of email.
Based on your post I believe Google Cloud is where your DNS is set up, or at least it has a pointer to the actual DNS zone as an NS record (which might be anywhere i.e. in AWS…).
Please also be aware that your _dmarc record defines how non-compliant emails are to be handled i.e. outright rejected.
Then there is the DKIM record that proves the integrity of the email and _dmarc record might drop mails that do not have a valid DKIM section.
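For illustration only, with a placeholder domain (and the DKIM public key truncated), the three records might look like:

```
example.com.                       TXT  "v=spf1 include:_spf.google.com ~all"
_dmarc.example.com.                TXT  "v=DMARC1; p=reject; rua=mailto:dmarc@example.com"
selector1._domainkey.example.com.  TXT  "v=DKIM1; k=rsa; p=MIIBIjANBg..."
```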
If you need consultancy, feel free to DM me.
1
Ask for developer-related training. You're in a junior positon. There is absolutely nothing wrong with asking questions. If your team isn't encouraging that, and you're actively trying to improve, that's their failure, not yours.
>But i know that i dont have the talent and the mind of a developer or a mathematic guy.
Substitute "experience" for "talent," because I doubt the latter very much.
1
Assuming you mean the more common infrastructure as code. And yes, this is the way.
1
At my company, devs attest that they have full-disk encryption and 10-digit password-protected local accounts.
All services are accessed by SSO, with MFA enforced, and we use incident reports and anomaly detection to revoke SSO to all services when needed.
If Devs lie on their attestation and there's an audit or incident, they are reprimanded up to and including termination.
We don't allow non devs to BYOD though.
1
at my org, it is basically the AWS Admin, and I have to be present during design meetings
1
At my place we use C# mostly for Azure Webjobs and now we are a lot into Python for Pyspark in Synapse clusters
1
At the suggestion of u/drakk0n I have added information on whether the conference has a hybrid version.
1
At work we call it "Promotion Oriented Architecture" lol
1
AuthPolicies and a ServiceEntry limited to the specific AWS API URL, using either the service account or the namespace as the source check. And an AuthPolicy for the same SA/namespace etc. to deny everything else.
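A ServiceEntry limited to a single AWS API host might be sketched like this (the host, name, and namespace are placeholders):

```yaml
apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
metadata:
  name: allow-sts          # hypothetical name
  namespace: test-ns       # hypothetical namespace
spec:
  hosts:
  - sts.eu-west-1.amazonaws.com   # the one AWS API URL to allow
  location: MESH_EXTERNAL
  ports:
  - number: 443
    name: tls
    protocol: TLS
  resolution: DNS
```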
1
Awesome thanks! This is good info. I'm almost done writing my CloudFormation actually just having some problems setting up the policy for a user to upload to a s3 bucket. Once I get that sorted out I'm gonna dig further into setting up some tests.
1
AWS Cloud Practitioner, AWS SysOps, and AWS Solutions Architect. From everything I've heard/read, those 3 are a good baseline set of certs to get. TY, good luck on your journey as well.
AWS disagrees with the premise of your title. They even make it easier by making ["Step Functions" available.](https://docs.aws.amazon.com/step-functions/latest/dg/tutorial-creating-lambda-state-machine.html)
1
AWS makes it ezpz to switch accounts, and provides clients in every language possible to query the API.
I think it will be much less work to maintain a Python script that generates an excel sheet with the info the auditor wants than to maintain a CMDB, automatically log all new resource creations into the CMDB, and spend time reconciling errors when resources aren't in the CMDB.
1
AzDO
1
Azure DevOps may fit the bill (* looks outside to see if things are starting to freeze *)
At first I wasn’t a big fan, having used GitHub enterprise and Jenkins previously. But I’ve been able to adapt and do the things I want. Definitely not perfect but it works well for our needs.
I’ve been using gitlab recently for a different project. After coming from azure DevOps, I’m not a fan of gitlab ci. It may just be familiarity bias.
1
Back in 2012 I had to upgrade the size of a disk running on a cloud instance so that we could fit more websites for our clients on a custom CMS.
Well I followed the upgrade instructions to the letter!
Turns out those instructions really formatted the entire drive before resizing the partitions or swapping out the drive for a larger block storage device (can't remember which it was)
Anyways suddenly 300 companies websites gone in an instant. Panic sets in as my coworker and I realized our mistake. The instructions should not have included formatting as a step!
It was pure luck that I decided 30 mins before this to make a tarball of the entire drive. I needed to transfer a few important websites to another machine for debugging. That tarball saved my ass. Still took an entire weekend to put it all back together, but yeah. Lesson learned.
1
Back in the day I was logged into the vmware console of a Linux system with hundreds of concurrent users and the system moved about $1M in orders per hour. I was on a Windows workstation, hit ctrl-alt-del to lock my workstation while I stepped away, and of course the VM rebooted. We learned that day to disable rebooting when pressing ctrl-alt-del from the console.
Much longer ago I had a raid array fail on a system and I was able to bring it back online, only for the people that used it to discover three days later that there was a data corruption issue. The vendor had no solution other than restore from known good backups, so the company lost three days worth of work for about 25 people.
1
Back when I was managing AD, I was working on migrating to Office 365 in a hybrid setup. We had already been using O365 for email, but hadn’t integrated it into our AD yet.
I wrote up a PowerShell script in a testing environment to link our on-prem AD setup with Azure AD, and all looked good. Went to do the same in prod (on a Saturday, nobody in the office), and all looked good. Until we noticed that EVERYONE in the company lost their email addresses. Mailboxes were still there, but they lost their email addresses.
Took me about 3 hours to fix it. Fortunately I managed to pull a report the day before for our users that happened to have all the email addresses in it. Wrote up another script that linked up the email addresses back to the users and all was good after that. That was stressful.
1
Bad documentation is almost always a sign of process failure: changes are being made rapidly, without an overarching goal; the team doesn't own its own process; and because project managers don't give a flying fuck about anything that isn't the immediate deliverable (which they rarely understand), this tends to translate into no documentation.
This is why "bad documentation" almost always goes hand-in-hand with "bad product" and "dismal support".
1
Bah, rrdtool all the way.
1
Base docker images
1
Bash
1
Bash and powershell, I'm trying to learn and implement ansible and ruby
1
bash and yaml, intermittently a splash of python.
moved away from groovy/yaml/casc (jenkins)
1
Bash Is your friend
1
bash, cft, hcl, javascript, python, and still cannot move away from sql
plus way more xml, json, and yaml than i'd prefer
1
Bash, HCL, some Python, Groovy; used to be rather good at PowerShell
1
Bash, PowerShell, Python, HCL, yaml
1
bash, powershell. Cannot apply any programming language as such in my current role so I am playing with Go.
1
Bash, Python, Ansible
1
Bash, python, hcl (terraform).
For apis, usually Java (springboot), or nodejs/express.
For assisting product teams: Java, nodejs, .net, react.
1
Bash, python, ruby (because Chef), groovy (because Jenkins), and some legacy automation written in perl that I maintain. Our devs write a lot of Java and Golang, which we review PRs for but don't actively develop in.
HCL and Yaml probably don't really count but those too.
1
bash, python, various iac
1
Basically most containerd/docker orchestrators allow you to run "tasks", which are one-off containers or sets of containers that run until completion and then exit
Another option is triggering things like AWS Lambdas or similar (serverless functions)
Basically, having used many CI systems and migrated between them on different projects, I've found the value of not "vendor locking" yourself to the CI system, and keeping it as a simple task runner. That keeps you flexible especially when CI systems change pricing or your businesses demands change. Less individual project churn, you can just run it elsewhere.
You could probably play around with the idea locally using minikube or similar. If you can run the CI process from your local machine then it's going to be straightforward to run it from most CI systems, because it means you've decoupled from its features.
1
Be extremely careful when trying to containerize anything you don't manage yourself, not because the technology is hard or because of sharp edges with these tools, but because you don't want to be in a situation where, on top of your app problems, you now have container and VM problems too. I'm an experienced SRE familiar with K8s and everything, and I work on a production environment without needing to run containers, because for what we deploy it's not really going to provide any advantages. We're pretty solid, we have a lot of other things to work on, and we don't get paged more than maybe a few days a year, and those are silly things. Our monitoring is pretty rudimentary, but it's sufficient, and we don't need to do much with it because the application really doesn't change much.
With that said, we have some containerized services for internal usage such as applications and services we wrote for ourselves, and those are simple enough that they just don’t break and we don’t care about patching them or anything anymore, so it’s a win. Next steps are to deploy an automatic dependency update tool similar to dependabot so we just approve PRs submitted by bots and run it through CI to make sure it doesn’t break.
1
Beats me, they already lost me at MySQL in Lambda :)
1
Because an ideal that doesn’t have a living example only exists as fiction. Doesn’t mean it cant be true.
The same reason we live in a world that is better than it was 1000 years ago. Unrealised fictional ideals.
1
Because the end user is computer illiterate. From their perspective they must click a button that launches the container. It’s acceptable to have some setup work where they (or their sysadmin) sets up an account and delegates privileges to my account, so we can run it for them on their dime, but that’s it. Ideally there would be some sort of oauth thing that would enable them to log in with the user and we would run it from there.
1
Because the tag is an actual reference to the registry the image is from, and will subsequently be pushed back to.
1
Because thinking is hard and more importantly expensive.
I mean honestly I'll never get how devs of all people don't understand the benefit of lazy evaluations.
The other one is getting buy in is like selling a product, you have to have a big enough dream worth putting money towards or people might ignore your proposal.
Obviously, careful, thoughtful, and incremental changes based on real evidence of the circumstances should lead to better outcomes, but all that evaluation is expensive, and harder to pay for up front.
1
Because YAML is a programming language?
https://www.redhat.com/en/topics/automation/what-is-yaml
1
Because you still need a shovel, lots of copper and lawyers to:
* dig a hole
* put a cable in it
* have the permits to do so
Even if you just license the usage of cables you need to be in a position to pay for all the physical labor _and_ expected profits to do so.
There’s only so much you can do with software only.
Even with 5G you’d have to invest massively as the frequencies require a higher density of antennas compared to previous standards.
1
Becoming good at k8s takes weeks - months. You can learn ECS Fargate in days. It simply runs containers in a "serverless" way and you have 0 maintenance to do.
ECS Fargate also integrates well with AWS ALB and ACM. So you get load balancing, SSL + SSL termination and automated cert renew.
If your service/app has a predictable user base and doesn't need to scale to infinity, ECS Fargate is a great solution.
1
Been using influxDB
To be honest though, I’m not well versed in DBs and have no way of telling if it can hit your transaction needs.
But wouldn’t that be more so based on the specs of the server it’s running on?
Open to corrections :)
1
Before moving everything to containers, look into automating VM provisioning and configuration, adding monitoring and logging, and setting up CI/CD. The first two you should be able to do with minimal to no change to the devs' workflow. Moving to containers without monitoring and logging in place sounds like a disaster, since it can be harder to access a running container and figure out what's going on. It will also require devs to change how they do things and might not have a ton of benefits for your app. As far as tools go, I use Ansible with Ansible Semaphore for automation and keep a repo in the git system the devs use. For monitoring I would look at Grafana Cloud, or running VictoriaMetrics, Loki, and Grafana on prem and using Telegraf to send data. CI/CD is going to involve the devs a lot, so I'd let them have a say since they will be dealing with it more than you.
1
Being a morning person is completely irrelevant to development. The traits you mention about yourself are actually quite stereotypical of devs as well.
That said... Job stereotypes are a terrible basis for choosing a career path.
Once you have a couple jobs under your belt you'll have your pick of companies to look for, and you can find a team where you click, have the right wlb, and whatever else. Learning dev skills will only make this easier to accomplish.
1
Being efficient with Prom stack takes time dude, just take it easy.
Start with the operations side of things (federated? Thanos as backend? retention and sizing calculations, etc.). IMO it's easier to get started in K8s, as the Prom stack has really matured on that platform and is easier to operate; you also get the added value of being able to understand all the Prom operator K8s API extensions (ServiceMonitors, Endpoints, etc.).
Grafana, for me at least, is much more intuitive once you connect the pieces.
When you're comfortable enough managing it (imagine you have a deadline for a prod ready monitoring solution on k8s for micro service based solution and just set it up) start delving into the wonderful realms of promql, I'd start with reading and understanding the rate function as an example.
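For example, the rate function applied to a counter (the metric and label names here are made up):

```
# Per-second request rate, averaged over the last 5 minutes.
rate(http_requests_total{job="api"}[5m])
```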
Good luck, take it easy!
1
Below is my analysis based on the details you shared. But I would also like you to clarify, what exactly you are trying to do.
**My Analysis**
Based on the error message you shared, 2 things you need to look at to begin troubleshooting.
**1) Check if the ssh key is valid.** Probably need to look at `ssh_authorized_keys = file("~/.ssh/id_ed25519.pub")`
**Why?**
Because the error says "Error: 404-NotAuthorizedOrNotFound, Authorization failed or requested resource not found. "
**2) Check if the issue is with the provider version and if upgrading to latest version will fix the issue.**
**Why?**
Because the error says "Provider version: 4.83.0, released on 2022-07-05. This provider is 1 Update(s) behind to current."
1
Best practice:
One production environment with feature flags and ephemeral environments on each PR.
At my current company, we have this for our frontends, but not yet for our backends - but working on it.
1
Best way to learn kubernetes? I am experimenting with rancher and adding more hosts to the cluster, not having luck
1
Better in what dimension/by what measure?
Like understood AWS better? Were \`imagemagick\` contributors? Something else?
Not sure how to action the suggestion.
1
Better make it 4. He can run a k8s cluster. The more nodes the better.
1
Biased: I'm currently CTO of Civo (although I'm leaving in a week)
Civo have a free (no strings) video course on Kubernetes, that may help you learn/get up to speed quicker. https://www.civo.com/academy
1
Biggest mistakes so far
Migrating a db from one server to another. Didn’t account for how long it was going to take. My idiot self said “welp, I’ll just let the backup finish, let it transfer over, and I’ll finish in the morning”. The huge delay basically lost all orders for that day, and some other process that depended on the database screwed over orders for the previous week. Took the company a week or two to recover and there was nothing I could do.
Other big thing: we were migrating out of a cage but leaving a customer behind to take over the cage. My boss told me to take apart the Hitachi SAN. Ok cool. Turned it off, uncabled it, but couldn’t get one of the panels off so decided to leave the more destructive stuff for later. About an hour later I get a frantic call: “what the fuck are you doing?”. Turns out there were two Hitachi SANs. One was the customer’s, the other was ours. Luckily they only lost one file that wasn’t important, from what I remember.
Oh! Yeah, there was the time me and the other student programmer killed the mainframe. Multiple times in the same day. We were trying to reorganize a db and weren’t paying attention to the column constraints. Unknown to us, a DB2 bug would lock a session when violating the constraint. Our script didn’t care and would move on to the next record. So we ate all the connections for the central DB2 server, which in turn locked all other services out. They would have to hard reset the mainframe and bring it back online, which was a 45min process. We didn’t care; we’d see the terminal freeze, grab a beer and some pop tarts, play Wolfenstein 3D for a bit, and then when the mainframe was back we’d try a new iteration of the script. Wasn’t till we saw that our account was no longer recognized that we figured we might be involved. It was kinda like that scene from Airplane! when the lady starts freaking out and there’s a line of people to slap her. Yeah, all the mainframe people hated our guts that day.
Last one. Not a big screw-up but funny to me at least. OS/2 networking allowed you to send messages to other workstations. Someone was sending messages to the print queue station that the student aides of the lab managed. The students were freaked out and asked me to look at it. I didn’t know, so I decided to respond asking who it was. Instead of send I basically hit reply-all, which replied to ALL OS/2 machines on the network. One by one each computer in the lab pinged with the message, like watching panes of glass fall over and shatter in a slow, domino-like fashion. The director of IT had an office overlooking the lab and the rumor was he was a scary yelly guy (very much the truth). He comes out and starts yelling “ARE YOU HAVING FUN?!” Left me pale and scared as shit, thinking I was going to lose my job. My boss comes into the lab laughing his ass off. Gets close enough to say “you spelled ‘you’ wrong” and then goes back to his office.
1
Bitbucket (sigh) and Jenkins (yay!)
1
Bitbucket is also in an AWS account, and the teams' IP addresses are whitelisted in the security groups for Bitbucket. This isn't ideal because it doesn't scale well, and there's a limit on the number of entries, so the infrastructure team has had to ask AWS to raise the quota a few times. We're just so decentralized that we run into these problems, but we're trying to place more guardrails around the teams to get more conformity in the future.
1
Bitbucket Pipelines are fine. If you host git on Bitbucket then it makes a lot of sense. We are currently running over 150 pipelines with multiple runners. Something like CircleCI is going to be marginally better, but it's another tool to manage.
1
Bitbucket still doesn't show verified/signed commits either, and there are tickets open for over ten years asking for it. We share this as a meme at this point: https://jira.atlassian.com/browse/BCLOUD-3166
1
Bitter much? Perhaps a bit of kindness could go a long ways here. Hope it all works out
1
blockchain synergies
1
Boils down to this:
Do you have deep understanding and experience with Operating systems, Networking, Infrastructure management, Application deployment, Development environments, and Programming?
If not, I'd start with building that knowledge.
Otherwise, sure, Kubernetes will have added value both in managing the company's systems and utilizing the knowledge and experience of a vast community.
I'd also add, Kubernetes is not a substitute for good engineering, although it is a well engineered system one can benefit by using it and learning from it.
1
Bootcamps do nothing except for briefly touching the surface of topics that are related to the term DevOps. I wouldn't put too much hope into these sort of courses. You cannot teach something that requires years and years of experience in a couple of days/weeks.
1
both
1
Both of you are partially correct. We have to understand that unlike in the past, IT powers the business a lot more. And a lot of business decisions are made based on IT capabilities.
So yes, even though the business may have failed because of business decisions, a lot of them could not deliver because their IT capabilities could not scale up at the speed they anticipated.
So yeah, when we discuss business failures, technical shortcomings will rarely get the blame, as all the decisions made were business decisions. Blaming IT would shift the responsibility to that team, which most leaders avoid doing.
I hope you understand the context.
u/Sparcrypt yes I have personally seen teams have difficulty scaling up operations at the speed of business. This in turn led the business to scale down its ambitions and eventually close down.
One was a small local bank and another a high profile insurance company that got sold off. Cannot name either of them due to legally binding contractual terms.
1
Bots
1
bro that guy who liked pulumi is exactly the case where i work. he basically built the entire infrastructure with a ton of questionable decisions and tool choices:
\- for pulumi he got it to automatically append tags but didn't keep async stuff in his head so now there's inconsistency with some stuff happening sync and some async which results in occasional but annoying state mismatches
\- we use pdm instead of pip or poetry or something else tested by the market because he liked having a folder for modules instead of a venv
\- we have a cli tool that's meant as the first line of defense before eng asks us questions and bro the tool is so buggy and it requires you to have docker just so that nobody has to deal with secrets and configs
\- we use rke2 for cluster creation which is fine but fucking annoying since we could've just used a managed solution
\- we use helm as a kustomization object and have to deal with hundreds of lines of unindented strings and you can't put them in separate files and just include them
&#x200B;
im still new in devops so this infra has helped me learn a ton in depth about everything i've worked with but man has it been hell! we could've done so much more useful stuff for the business if not for these decisions lol
1
Bro those aren’t recruiters it’s scammers lmao
1
Broke a VM - had to restore it from capture, lucky SOP was to take captures before making modifications.
Ran load test against production instead of test environment during peak hours. Luckily no loss of data or service but everything slowed down considerably.
1
Broke production galera cluster with 250+gb data.
1
Build scripts should be in git with the source.
I had someone today create three separate repos with k8s YAML files for three new src repos, and I had to tell them to quit it with that shit. YAML and src like to be cozy.
1
Build scripts should definitely be committed to DVC (distributed version control, i.e. Git). If this is part of the repo they are build scripts for or if they should be in a separate build-scripts repo is largely a matter of preference.
1
Build the actual application first and run it on an EC2 instance, then try to containerize it, then try the serverless route. This way, you learn all the ways you can run your application. In terms of actual infra, yes, that would work, but there are other ways to run the same setup.
1
Buildkite and GitHub Actions
1
bust out tcpdump
1
But the lights go out automatically, so nobody will see you cry! :D
1
But this will be pushed to all the pods across the mesh. I don't want this. I need to restrict access just for a particular set of pods.
1
But wouldn't you then need to run Nomad yourself? I get the argument for ECS with workloads that don't need many dependencies (third-party Helm charts).
1
But, and it’s worth emphasizing, talking companies off that ledge is our responsibility as professionals. You can even acknowledge the situation when communicating a recommendation in nay-saying one of those things. “I have a professional responsibility to recommend a course of action that is best for this company & not best for padding my resume, and so we simply don’t have the organizational complexity here to merit micro services. The increased complexity will cost much more than it’s worth”.
1
But… HCL is a programming language?
Or are you one of those “let’s do flow control in a document” monsters?
1
By 'their own machines' do you mean that they are developing on personal devices?
1
C# mostly, a touch of bash, and some yaml (which isn’t really a language)
1
Can also give [Doppler](https://doppler.com) a try.
Works really well with Kubernetes systems for our case. Supports auto updating deployments upon configuration updates.
1
Can confirm this. Now it’s “which arguments do i pass” and “what environment variables do i need again?” “how does docker decrypt my kms encrypted secrets?”. Or the latest “why isn’t this docker image running on my m1 mac”.
I’d encourage asdf for terraform, i switch versions daily.
1
Can I ask what are the specs to run such a home lab?
1
Can I get your job then? Lol I've been looking for a jr dev ops position.
1
Can someone explain to me please what are the benefits of using kubernetes inside aws ec2? What's the point of putting a virtual environment inside another virtual environment? Wouldn't I be better off orchestrating using aws api?
1
Can you even do rolling/A-B updates for BGP?
What would even happen if there are two conflicting configurations? I really should refresh my routing knowledge.
1
Can you give an example of a complex problem that Terraform was harder to work with over using Pulumi?
1
Can you pick out a feature you miss in bitbucket compared to gitlab?
1
Can't CodePipeline be *that* GUI?
1
Can’t say I’d recommend using consul without vault. Consul as a backing store _for_ Vault, sure.
1
Candy and unicorn swag?
1
Cargo cult is the biggest problem in IT in my opinion. Most IT professionals struggle to explain why something should be the way it is. Often I've been told "It's what is in the book", "There are 4 other developers who agree", etc when asked, "What benefit does this provide?"
1
Cargo-culting is (along with tech debt pay down strategies), imo, the most common tragic flaw in our industry. Oftentimes it’s a terrible engineering decision but advantageous “professional development” decision since the implementing engineers can take, say, a kubernetes migration project and flip it to an SRE role.
People have to realize that FAANG, though they dominate the industry by market cap, have an outsized and inappropriate influence on design patterns, for all aspects of the software engineering life cycle. Simply put: FAANG is 5 companies, with hundreds of thousands of engineers (probably nearing 1mn altogether). By the number of companies (and thus, decision contexts for design), it's 5 out of tens of thousands.
I could build a fucking encyclopedia of “things that FAANG does that you almost certainly should not do”:
- interviews in FAANG style: most of what they do is to combat rote memorization of a limited test bank, and more directly, straight-up cheating. It's much less likely that your 100-person company is going to be targeted for cheating. Just ask about git and have a single example problem you run through every time. Randomization is inefficient and entirely unnecessary.
- microservices: NO. Why introduce code boundaries when you don't have to? You want to have a guy whose sole responsibility is git/kube/helm/iaasP/docker/etc., and nothing else? Go with a monolith.
- gitflow: NO. Why build a system around concurrent contemporary versions if you are completely in control of deployment of a single web app? Have a hot patch workflow but don’t do multiple versions. Just implement simple feature flags for toggling a feature on once it’s passed review, but always have trunk be moving forwards.
- Terraform: Unless you’ve already got a guy who can write & maintain it, don’t go down the rabbit hole of declarative infra mgmt. See Spotify’s examples of “woops, we nuked our DNS when we were still learning how to do this process right” to see what would otherwise consume your time even if you had a devops-only team (which they do).
- secrets management: you can trust your team, you don’t need to build a 12 step process for on boarding someone. Give each dev access to most everything, just with some reasonable “ok not absolutely everything” ring-based control. Two tier security levels in your org is enough for most dev teams.
- types: types are great for large organization where the complexity of your object model is too much for anyone to have memorized. But if you’ve got like 8 objects, consistent & coherent cross-stack types are unnecessary. Spend that effort on refining your requirements-focused end-to-end testing scheme.
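For what it's worth, the "simple feature flags" idea from the gitflow point above can be as small as gating a code path on an environment variable. A hypothetical sketch (the flag name and flows are made up, and this is not any particular flag framework):

```shell
# Hypothetical sketch of a minimal feature flag: gate a code path on an
# env var so trunk always moves forward and no long-lived branch is needed.
checkout_flow() {
  if [ "${FEATURE_NEW_CHECKOUT:-off}" = "on" ]; then
    echo "new checkout flow"
  else
    echo "old checkout flow"
  fi
}
```

Flipping `FEATURE_NEW_CHECKOUT=on` at deploy time toggles the merged-but-dark feature without maintaining a second deployed version.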
Decisions on the above-mentioned things can be the make or break of your business. Or maybe better said, of whether you & your workmates enjoy your job.
1
Cash is king.
The numbers show about 90% of start-ups fail, 70% of them in years 2-5. Ask yourself: would you invest $70k/year into this company floating your new offer, knowing there are 2-1 odds you'll lose every penny of it? Would _you_ hire engineers, asking them to invest 40% of their salary into your company as a condition of their employment?
This doesn’t even address the nightmare that is Azure, and how miserable Azure based startups can be when it comes to a variety of processes. I once worked at a shop that was migrated from GKE to AKS; it was an absolute nightmare, but a hard requirement as MSFT was sending us desperately needed sales leads. The needs of the business completely outweighed the operational pain.
As an ops type person, getting in on a super lucrative startup is pretty much a crap shoot. You probably won’t, and making reckless decisions because someone’s flashing lots of IOUs at you isn’t going to make you crazy rich. More likely, it’ll make you burnt out and frustrated at how little money you’re earning and how high the pressure _everyone_ is under, because they’re equally personally gambling on the company making it big. I know it just isn’t worth it to me.
1
CDK saved my life.
1
Changelog.md generation is so common (pretty much every release plugin supports it), and it requires committing back to the repo.
The same goes for documentation updates after a new version is released.
1
Check out buildkite!
1
Check out Sysdig
1
Check out Loki for logs; it integrates with Grafana.
1
Cheers, that makes sense. Slowly getting my head around it!
1
circleci, github actions. now i'm starting to learn aks with jenkins
1
Clearly you have no interest in anything other than being contrarian, and acting like you are the smartest person ever. I am familiar with group think, and asking if I've heard about it, does not mean what's happening here is group think. OP posted something useful that could have equally been a recipe, a game, a utility, it doesn't matter. Various individuals believe it to be a valuable contribution, sans you - if they arrive to their conclusions individually, that's not groupthink. If you find no value in this for your use case, cool. Clearly others do.
1
Close, it was saltstack creating a file rather than just writing to it.
1
CloudFormation change sets are the equivalent of Terraform's plan.
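For illustration, the plan/apply analogy in AWS CLI terms (the stack and template names are hypothetical, and this is a sketch rather than a full workflow):

```shell
# Sketch (stack/template names hypothetical): change sets give you a
# reviewable diff before applying, analogous to plan/apply in Terraform.

# Roughly `terraform plan`: stage proposed changes without touching the stack.
aws cloudformation create-change-set \
  --stack-name my-stack \
  --change-set-name preview \
  --template-body file://template.yaml

# Inspect the staged diff.
aws cloudformation describe-change-set \
  --stack-name my-stack --change-set-name preview

# Roughly `terraform apply`: execute only after review.
aws cloudformation execute-change-set \
  --stack-name my-stack --change-set-name preview
```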
1
Codefresh, it's very convenient for this stack.
1
Combining GitLab (CI) with [Qovery](https://www.qovery.com) (CD) sounds like what you need.
1
Comes from old school sysadmin world view where if you can't touch the hardware, does it even exist?
And the idea that any janitor can walk into an Amazon data centre with a USB drive and pull down your super secure customer data from a specific server which has the EBS volume for your database, and then decrypt it with all the KMS keys they have lying around.
> If you aren't willing to trust a cloud providers networking, compute, and encryption, then you shouldn't be using them.
+1. You either trust them, or you don't. Everything inbetween is just security theatre and compliance checkboxes.
1
Coming from mobile development, I think you have other problems :-) Using a GUI for Git or not doesn't make a difference.
1
Companies should also make it clear in the job descriptions what they're looking for...
I come from a Sys/PlatEng background, so my coding is not as good as someone with an M.S. CompSci degree.
If I evaluate the position and your screen makes it seem like that's what you need, don't give me LeetCode-type problems just because you don't know how to screen properly. As a senior engineer, my goal is to solve your problems (design, arch, org, pipeline), not churn through rote-code memorization.
1
Completely agree. It was kind of a fun light read for the most part but also infuriating at being so naive and hand wavy. Somebody learns about event sourcing, they implemented it in a weekend and fix the discombobulated reporting system, give me a break.
1
Computers are super fast these days and most developers and SREs are basically overengineering stuff in the end honestly because they lack the time to right-size things and make it easier to maintain with less thinking involved. You can get away with quite a lot of vertical scaling for a SQLite database these days, especially if you use a solid NVMe SSD, so unless your company is trying to make that some flagship SaaS product I wouldn't be too worried as long as the application itself is fairly stable and not terribly error-prone / buggy.
Any idiot with some time and access to the Internet can take a random application and shove it into a container haphazardly these days - that's kind of step 1 of maybe 30+ such as monitoring, disaster recovery, etc. A professional understands the lifecycle of the applications & services and weighs the effort against other options and priorities in context of the organization's goals and resources. Maintaining containers and producing them is a non-zero cost.
Fun fact: for a long time VMware's own internal IT didn't use VMware, long after it IPOed and everything. Why? The overhead of managing VMs wasn't worth the gains possible for the organization! Think carefully about your own goals and what you really want out of the job. I'm seeing some vibes of resume padding, which is probably not a bad idea honestly, but being judicious about one's use of time is likely the #1 professional skill that separates the overworked and constantly behind from the successful and fulfilled in their careers.
1
Concourse is really great; I love it especially for multi-repository pipelines. Their resources system is very flexible and allows for really interesting scenarios.
So far I haven't found any use case it could not achieve, despite having fairly complex or specific pipelines.
1
Congrats on the move! It's a big deal for your company, I'm sure. I recently (with lots of help) moved our entire company to k8s and rewrote our internal platform as a service. It was a blast because k8s is so powerful and there are great patterns for extending it.
Recruiter pings are meaningless they have no idea and are just searching for keywords
1
Congratulations, you’ve just proved you don’t know what declarative actually means.
Pulumi is “completely declarative”
1
Consulting is also a nice way to try part-time work (and still pay the bills). Just need to acquire your own health insurance if you're in the U.S.
1
Consulting really depends on the company. There are two local consulting companies I'd love to work for, one I'd consider working for, and several others I'd only work for if I was desperate.
Why? Two of them really only take projects that are greenfield or moving legacy to cloud. They have a great reputation and seem to treat their staff well.
The tolerable one has all sorts of work, but at least they treat most of their employees well or try to do so. But, they are big enough that I’ve seen people who weren’t getting what they wanted leave.
The ones I don’t want to work for don’t have a great reputation. They are about money only.
1
The control plane is the easy part. CI/CD, monitoring, load balancers, external DNS, secrets management, cert management, logging, etc. are hard.
1
Converted two private subnets to public subnets. Apparently these subnets were also attached to some of our production systems. Our internal systems started failing to communicate.
Reverted the changes one hour later.
1
cool.
1
Correct.
1
Correct. Just saying if I had to use Pulumi, it would only be in YAML, which kind of defeats the flexibility of it all, but at least everyone knows YAML. I'm with you though... Terraform is the de facto standard.
1
Correct. Maybe as an added bonus watch the 10 Deploys Per Day Flickr presentation from Velocity 2009 that helped to kick-start the movement.
https://m.youtube.com/watch?v=c6tWX48tmAo
1
Correct. There is a helm chart and you can conveniently specify cpu/memory requests for each job in your pipeline, so the jobs get distributed over your cluster. Scaling it up and down as needed.
[Try IRSA](https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html). We create the clusters with CDK and deploy runners for each team that have access only to that team's other AWS resources, e.g. secrets, keys, parameters.
1
Cost of living should not matter. It's a dick move by corporations to be cheap on ppl.
If the cost of living is super high, do you think they increase salary much? Fk no. In the US so many ppl are wage slaves.
1
Could have the image build pipelines clone from a repo that just has the script, so it's only in one place and they all get it. Clone it to a standard directory location and they can all call it from there. Removes the need for the second container.
1
Could this not have been done with a git alias and xargs?
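Not knowing the original tool, here is one hypothetical shape such an alias-plus-xargs combination often takes (the alias name and branch filter are made up for illustration):

```shell
# Hypothetical illustration only: a git alias that feeds branch names
# through xargs. Here, "sweep" deletes local branches already merged.
git config --global alias.sweep \
  '!git branch --merged | grep -vE "^\*|main|master" | xargs -r git branch -d'
# afterwards, `git sweep` works in any repo
```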
1
Could you specify why it sucks?
1
Couldn't you do this with a network policy?
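For context, a NetworkPolicy can scope egress to a labelled set of pods by port and CIDR, but it operates at L3/L4 and cannot match hostnames/SNI the way an Istio AuthorizationPolicy keying on `connection.sni` can. A hypothetical sketch (names made up):

```shell
# Hypothetical sketch (names made up): scope an egress policy to one
# labelled set of pods. NetworkPolicy matches CIDR/port, not SNI.
kubectl apply -f - <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: restrict-adapter-egress
  namespace: test-ns
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: test-adapter
  policyTypes:
    - Egress
  egress:
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
      ports:
        - protocol: TCP
          port: 443
EOF
```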
1
Create a GitLab repo, spin up a local runner (I can help if you need, DM me), and use that to learn some CICD. It’s pretty simple. GitHub actions are great for this too, but I don’t have the experience there to help with that.
Learn about terraform and maybe try some minimal “hello world” stuff with it. You’ll get a lot of miles from that knowledge.
1
Creating new pages every time someone experiences/does something, dumping them into an unorganized Confluence setup instead of maintaining documentation in a sensible structure.
1
Cries in Chef
1
Cringe
1
Crossplane IMO. While pipelines for infrastructure have their use cases I believe they should generally be phased out.
1
Cryptographically speaking, no. Preventing that exact kind of man in the middle is exactly the point of mutual TLS.
The way that I typically resolve this problem is to give the proxy authorization to attest clients, and have it pass along the client identity in a way that cannot be cryptographically linked to the client by upstream services via some kind of header such as an Authorization token or X-Forwarded-Client-Cert.
When the upstream detects one of those client attestation headers (asserting a client identity without proof they actually are the client), the upstream checks if the client actually making the connection (the reverse proxy server in this case) has Authorization to perform that attestation on behalf of that client and if that check passes then the upstream proceeds using the identity presented by the reverse proxy.
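The check described above can be sketched as a tiny allowlist function (the identities are hypothetical; a real implementation would derive the peer identity from the verified mTLS certificate, not a variable):

```shell
# Sketch of the attestation check described above (identities hypothetical).
# Only peers on the allowlist may assert a different client identity via a
# header like X-Forwarded-Client-Cert; everyone else is taken at face value.
TRUSTED_PROXIES="spiffe://cluster.local/ns/edge/sa/reverse-proxy"

effective_identity() {
  peer="$1"       # identity proven by mTLS on this hop
  claimed="$2"    # identity asserted in the forwarded header (may be empty)
  case " $TRUSTED_PROXIES " in
    *" $peer "*)
      # trusted proxy: accept its attestation if it made one
      if [ -n "$claimed" ]; then echo "$claimed"; else echo "$peer"; fi ;;
    *)
      # untrusted peer: it can only speak for itself
      echo "$peer" ;;
  esac
}
```

For example, `effective_identity "$mtls_peer" "$forwarded_header"` yields the forwarded identity only when the connecting peer is on the trusted-proxy list.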
1
Ctrl + Shift + Esc opens the Task Manager too.
(Though Ctrl + Alt + Delete seems to do a better job if the machine is almost unresponsive.)
1
Ctrl+Insert: Copy
Shift+Insert: Paste
1
Cuelang.org
It’s a superset of YAML & JSON that DRYs up config without using inheritance. Compiles down to JSON or YAML. Made by a Googler.
1
Cumbersome unless you are using environment secrets with GitHub Enterprise ($$$)
1
Currently evaluating a few options for a similar size platform (well… about 1B total active monthly ), and VictoriaMetrics has stood up really well. Just did a 10x growth test and my cluster scaled reallllllly well.
1
Currently renting bare metal boxes to size, not using any cloud providers. How does Kubernetes behave with bare metal? Need to get my hands on some lab equipment to start these experiments.
1
Currently renting bare metal to size. No cloud providers in use. Those bare metal boxes just have VMs on them and when more VMs are needed more boxes are bought. It's currently Proxmox manually provisioned on top of Debian, because there is something going on with the main host. I think there is a backup of the filesystem using borg.
Moving to the cloud is another undertaking entirely, and certainly would raise operational costs.
> Focus on building docker containers based on the 12 factor app principals
This is something I need to look into further, thank you
1
CyberArk has something called Conjur, see GitHub: https://github.com/cyberark/conjur. No experience with it however.
1
Daily use: bash, python, yaml, occasionally useful to know php and sql to suss out what's going on in the apps I support.
Would like to learn and use: golang, rust
1
Damn it, now you've given me more things to research :-p I will definitely look into Neon. I took a look at the website. Seems like a cool idea if it actually works. I've worked with a team before trialing different PG implementations on K8s, but we kept going back to VMs. Looks like what they are doing is decoupling the DB engine from storage. Here, we still have to host storage, though. Tough to get away from hosting live data in some capacity. I like that their backend automatically pushes cold data to S3. I'd be curious how much customization is possible on that, as I've worked in HPC and Big Data environments where our data tiering could cause compute bottlenecks. But we had tape archives, too...so it would literally sometimes initiate a call to a robotic library to load an LTO tape while an informatics person would be scratching their head wondering why it's taking so damn long to `cat` a file lol
1
Damn right lol
1
Dang that's pretty ridiculous. Getting fired for that seems a bit extreme.
1
Data Analyst.
1
Day of week, day of month, …
For a time series, finding the shard key is _usually_ the "easy" part.
1
Dedicated servers on Hetzner with self hosted GitLab.
I think Hetzner cloud offering also includes a managed GitLab now.
They're a solid choice and very cost effective.
1
Define 'spend huge' - I believe InfluxDB will scale to that sort of load with the enterprise side of things.
1
Define coding, though. Most DevOps / SRE folks define coding as configuring YAML, JSON and doing some IaC (Terraform, etc.), with maybe a little bit of scripting.
That IS coding, but not software-engineer-level coding.
1
Defining a daemon is indeed best practice, but baby steps :^)
1
Deleted the /var directory instead of ../var. This was before rm had a root check.
1
Deleted a Windows NT OS on a server that was home to our source code control!
RAID 5 saved me.
1
Deleting the whole node.
1
Deny all but RELATED, ESTABLISHED, of course.
Figured I didn’t need to explicitly say that cos obviously that is what you do. Sheesh.
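That default stance, expressed as an iptables config fragment (not run here; a real ruleset would be tailored to the host, and applying this over SSH without the ESTABLISHED rule first would lock you out):

```shell
# Config-fragment sketch of a default-deny inbound firewall stance.
iptables -P INPUT DROP
iptables -A INPUT -i lo -j ACCEPT
iptables -A INPUT -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
# ...then only the explicit allows you actually need, e.g. SSH:
iptables -A INPUT -p tcp --dport 22 -m conntrack --ctstate NEW -j ACCEPT
```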
1
Depending on the country, that should be your biggest win of your career.
1
Depending on which CICD platform they're using, it may already have an artifacts feature that you can also set a lifetime policy for like retain last 5 builds.
1
Depends if you can push or pull or both; I take it you might not have access to a lot of commands, these being embedded systems.
I would tar and sftp the archive if I had to push.
I would use rsync to pull the data to a backup server.
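A minimal sketch of both directions (hosts and paths are hypothetical; the remote commands are shown as comments since they need the actual servers):

```shell
# Hedged sketch, push side from the device: archive, then ship it out.
mkdir -p /tmp/demo/data
echo "sample" > /tmp/demo/data/metrics.log
tar -czf /tmp/demo/metrics.tar.gz -C /tmp/demo data
# scp /tmp/demo/metrics.tar.gz backup@backup-host:backups/   # over SSH

# Pull side, run from the backup server instead (transfers only changes):
# rsync -az --delete device-host:/var/lib/metrics/ /srv/backups/device/

# Restore is plain extraction:
mkdir -p /tmp/demo/restore
tar -xzf /tmp/demo/metrics.tar.gz -C /tmp/demo/restore
```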
Either way, I would have the file somehow committed to a git repo, and probably do some git crimes while at it.
A few possibilities:
* every server gets their repo
* one repo for everything, with servers in different folder
* one repo for everything, with servers in their own branch.
* one repo per environment (one for dev, one for preprod, one for prod) and another one per component/app/server type. I would push twice: once to the right environment and once to the right component/app/server. That way you can look at a fair amount of data with high cardinality in the right context, handy to compare.
Some of those possibilites are questionable, and could really break your git server.
A humongous amount of data in a single repo tend to not be the ideal usage of git.
Do bring more details if you want me (or someone else to chip in)
1
Depends on your needs.
GitHub Actions and Bitbucket Pipelines are both good. I guess, if you've been previously using GitLab CI, these would be the natural choices.
If you're looking for mobile CI/CD, look at mobile-first alternatives such as Codemagic or Bitrise.
1
Depends what you’re doing.
You could easily create a routing loop, blackholing traffic.
i.e. router A thinks it should send traffic for destination X to router B. But router B thinks it should send that traffic to router A. And it bounces back and forth until TTL expires never getting where it should.
1
Deployed a bad release to prod accidentally, every minute of downtime cost us $1M+.
1
Developers (not "ops" or "DevOps" people) deploying to production multiple times per day.
1
Development cycle is much faster because it compiles down to a binary faster than the Python interpreter starts, and you can cross compile meaning you can build for Linux on your Windows machine, upload it to a server and test it faster than you can with Python.
1
DevOps as a whole is already becoming over-engineered, so no Pulumi for me. HCL will do what I need.
1
DevOps guy here, I highly agree with the roadmap. There's a lot to cover before one is able to make any sense of what is going on for DevOps.
1
DevOps is a skill and a craft:
* Baking a cake. The first few cakes you bake are going to suck, but the more you do it the better they will be. I have recently started watching "Is it Cake" on Netflix \[not an ad for that show\] and i am blown away by their abilities to take tons of different techniques to make cake look like a real life object. That takes time + energy + failure + learning to get there. Every baker on that show has put time into their craft to get as good as they are.
* Pumping out septic tanks. It probably takes a lot of trial and error to get to a place where you can pump out a septic tank and not end up covered in poo. That takes time + energy + failure + learning to get there.
* Being a machinist. You are going to scrap a lot of parts by messing up some tight tolerance before you become Abom79 who is a shining example of mastering their craft. That took time + energy + failure + learning to get there.
I think a common thread across the examples i have listed above is that the outcome is generally pretty well set from the beginning. Cake, no poo, metal that serves a function, etc.
In DevOps, i feel like there is too much importance placed (early on) on skill development without really understanding the goal that DevOps is trying to achieve to begin with. My background is in software development and i moved to a heavy-IaC, mild-DevOps role because i found software development (in the space i was working) to be extremely dull and boring. As i continued to shift more into DevOps and away from pure IaC, i had to spend a lot of time discerning the difference between what you see in blog posts and what a business really needs to be successful with DevOps. Also, working in a large organization, i had to come to the realization that there is NO RIGHT WAY TO DO DevOps. No matter what you do individually, it will be wrong.
DevOps is about working with a team and optimizing flow of value to the market. Lots of skills and skillsets will go into that optimization. I like to believe that my software development skillset lends itself well to the atmosphere because i am able to understand how the devs are thinking and how the infra and ops people are wanting things done to help increase that flow.
With this long response, i think what i am trying to get at is this TLDR; DevOps is like any other craft out there but is actually less about being really really good at any one thing (though it does help), it is about optimizing the delivery for an entire team and helping that team overcome challenges and obstacles. I would talk with your manager(s) about how you start to obtain those kinds of skills through the lens of optimizing flow of delivery.
For Jrs understanding suboptimal delivery flow can also be a challenge, but any Sr worth their weight will be happy to tell you all about their delivery challenge horror stories to help give you some things to focus on and unravel.
1
DevOps is NOT about tools. DevOps is about enabling flow of delivery. Tools can be trained on the job, and more often than not, are trained on the job. Understanding an SDLC and what tools fit into that SDLC is far more important than asking the question "if I know this tool, can I do devooops?"
1
Devops is not the path to ML …
1
Did something similar during the migration of a non-IaC environment to Terraform code.
The VPC module updated all the routing tables, removing the VPC peering routes :))
Spent whole night fixing it and kept my mouth shut the next day.
1
Did you consider a layer 7 firewall? Seems like managing ip based rules is getting more and more untenable…
1
Did you consider hosting your own gitlab instance? Check https://docs.gitlab.com/ee/install/azure/ its much cheaper than gitlab premium tier.
1
Did you make any kind of avoidable mistake / is there anything you would do differently in hindsight?
1
Did you see significant cost reduction in operations or overall cost, after? I'm curious at what level of $$ management suddenly changes its mind.
1
Did your employer pay for it?
1
Didn’t even know about DevOps until Frank Niu mentioned it and figured that’s what I wanna specialize in. (After I get the basics down in all other areas ofc)
1
A different approach that might not work for your workflow:
Do you need to actually commit that thing each time the pipeline runs?
Why not put 2-3 `<branch>.html.tmpl` into a `tmpl` folder somewhere, and `mv ${projectRoot}/tmpl/${branch}.html.tmpl ${projectRoot}/index.html` (where `${branch}` is the branch name from `refs/heads/...`)?
I can only possibly imagine 3 different html templates, them being dev, (staging?) and prod.
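The idea above can be sketched as a small pipeline step. This is only an illustration: the `demo/` layout, the branch names, and the fallback-to-dev rule are my assumptions, not from the original comment.

```shell
#!/bin/sh
# Sketch of the template-per-branch idea. In a real pipeline, BRANCH
# would come from a CI variable; here a demo layout is built inline.
set -eu

# Demo layout: one template per long-lived branch (hypothetical).
mkdir -p demo/tmpl
echo '<h1>dev</h1>'  > demo/tmpl/dev.html.tmpl
echo '<h1>prod</h1>' > demo/tmpl/prod.html.tmpl

BRANCH="prod"   # in CI, something like the current branch name

# Pick the branch's template, falling back to dev for feature branches.
TEMPLATE="demo/tmpl/$BRANCH.html.tmpl"
[ -f "$TEMPLATE" ] || TEMPLATE="demo/tmpl/dev.html.tmpl"

cp "$TEMPLATE" demo/index.html
```

Using `cp` instead of `mv` keeps the template in place so the step stays idempotent across pipeline runs.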
1
Dig
1
ding ding ding!
1
Disclaimer: I am a Pulumi employee. I was also a Terraform user from the first release and some of my closest friends are former Terraform maintainers.
Most of the comments here are following the same pattern that these threads usually take.
Lots of people come along and say Pulumi is easier to use, more expressive, and [makes you more productive](https://www.reddit.com/r/devops/comments/vvcle7/for_those_who_tried_both_pulumi_terraform_which/ifkfqz2/).
Then a bunch of people come along and say "Terraform is the industry standard, HCL stops you from creating footguns, [everyone knows HCL](https://www.reddit.com/r/devops/comments/vvcle7/for_those_who_tried_both_pulumi_terraform_which/ifjuo51/)"
Eventually we're going to have that suite of users who come along and say "yeah Pulumi makes you more productive, but do you *really* want your users creating your infrastructure?! they don't know what they're doing!"
The industry is changing. Terraform was a very welcome and needed tool 5 years ago, but it isn't innovating in the space, it has a bunch of fundamental design flaws which are going to be very difficult to fix and the industry pioneers are very quickly moving to Pulumi.
You have two choices: stick with the tool you're used to and that makes you feel safe, or move with the times and embrace a tool that is faster and empowers the engineers you're working with.
1
DM me
1
DM me. I've used some Python modules that worked well that I can share. Super easy in any pipeline.
1
Do it incrementally, as you work on proper tasks
1
Do not save secrets inside images.
Volume-mounting the certs directory is the sanest approach: either from the host, within a named volume, or via Docker secrets (though Compose may not support that; not sure, I don't usually use Compose).
You could also write a custom entrypoint to fetch the certs from an external store, though that's more to maintain.
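For the volume-mount option, a minimal Compose sketch might look like this (the service name, image, and paths are hypothetical, just to show the shape):

```yaml
# Certs live outside the image and are mounted read-only at runtime.
services:
  web:
    image: nginx:alpine          # any TLS-terminating service
    ports:
      - "443:443"
    volumes:
      # Host directory (or a named volume) holding the certs.
      - ./certs:/etc/nginx/certs:ro
```

The `:ro` flag keeps the container from modifying the certs, and rotating them only requires replacing the files on the host and reloading the service.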
1
Do the following. You should be good to go.
If you can, get certified in the following if you think it will help in your country/city: one cloud Associate cert, the CKA, and HashiCorp's Terraform cert.
Linux - Pick one of the YouTube videos to learn the basics and get your hands dirty by installing on a VM and play with it or use WSL2.
Python - do this course [https://www.edx.org/course/cs50s-introduction-to-programming-with-python](https://www.edx.org/course/cs50s-introduction-to-programming-with-python)
Git - pick one from YouTube and understand how Git works
Cloud platform - PICK ONE - AWS (Adrian Cantril's SAA course [https://learn.cantrill.io/](https://learn.cantrill.io/)) or GCP (Dan Sullivan's course - https://www.udemy.com/course/google-certified-associate-cloud-engineer-2019-prep-course/) or Azure (I don't know who a good tutor is for Azure).
Jenkins and Ansible - [https://courses.morethancertified.com/p/devops-in-the-cloud](https://courses.morethancertified.com/p/devops-in-the-cloud)
Ansible - [https://www.udemy.com/course/learn-ansible/](https://www.udemy.com/course/learn-ansible/)
Terraform - [https://courses.morethancertified.com/p/mtc-terraform](https://courses.morethancertified.com/p/mtc-terraform)
Docker -[https://www.udemy.com/course/docker-mastery/](https://www.udemy.com/course/docker-mastery/)
Kubernetes - [https://www.udemy.com/course/certified-kubernetes-administrator-with-practice-tests/](https://www.udemy.com/course/certified-kubernetes-administrator-with-practice-tests/)
Projects
Cloud Resume Challenge - [https://cloudresumechallenge.dev/docs/the-challenge/](https://cloudresumechallenge.dev/docs/the-challenge/)
AWS + Terraform - [https://courses.morethancertified.com/p/rfp-terraform](https://courses.morethancertified.com/p/rfp-terraform)
Tons of small projects - [https://github.com/100DaysOfCloud/100DaysOfCloudIdeas](https://github.com/100DaysOfCloud/100DaysOfCloudIdeas)
I personally would avoid bootcamps as they are expensive and just an overview of the individual topics.
1
Do you really think HCL has a lower barrier than Python, which you can also use for scripting, for configuration management with Ansible, for software-defined networking, and for building your applications, which are the purpose whereas infrastructure is the means?
1
Do you have a guide on how to create a dev environment? I want to do the same for visual studio project.
1
Do you have any advice on personal projects or career development steps for a college student who wants to break in? I'm majoring in IT, not CS, which means I take a lot fewer (but still a few) programming and system design related classes, so obviously I assume I need to take time on my own to make up for those shortcomings.
Currently I have an internship working for a NOC team and actually got a part time offer, but there really isn't much opportunity to learn about the traditional sysadmin-esque concepts and tasks. Working full time as well as school, while simultaneously not learning a whole lot for where I want to end up during work makes it hard to balance ALSO studying in my free time, but I figure that I gotta do what I gotta do.
I suppose it's hard to give advice when there's so much detail lacking from my situation, but maybe even just some general advice would be immensely appreciated.
1
Do you have any experience translating Web3 business-speak into actual technology?
So far I've only been able to understand Web2 + BlockChain = Web3 Decentralized Internet Magic.
At a DevOps level it's still classic SDLC management, but the startups are trying not to use FAANG, and then they need to scale, so where do they go? AWS or Google... Is my grey beard showing?
1
Do you have something specific for point 1 that you can recommend?
1
Do you mean it's better to keep on Googling instead of reading books (if I'm totally new to the concept) as most books start from "beginning" - over and over we see the same concepts in most of the books?
1
Docker
1
Documentation that's hardly worth the description.
I'm not a fan of vast, uncommented cut-and-paste orgies. I'd rather know the ins and outs: what are we trying to achieve, how does it work, and maybe a little history about why it's in its current desolate state... and THEN a couple of example commands.
1
Does Gatsby require the GraphQL server to be reloaded to reload data? If that's the case, then no.
The idea is that the backend server has no business logic and is only a sink, source, or relay for events. It does very little. You could keep it up for years at a time.
The frontend SPA and the backend could change rapidly and independently without affecting the actual internet-facing component.
1
Does it? I must be blind. https://docs.docker.com/registry/configuration/
1
Doing manual or separate migrations before deployment is an anti-pattern tbh.
1
Don't care. Reference it in the text at the minimum and make sure it's very close to the text it's referenced in.
But good job at having a fight about "how documentation", not "if documentation".
1
Don't create the certificate in Keycloak. Just set the SSL requirement to external. That way you'll be able to proxy SSL connections using your DNS provider, like Cloudflare or Route 53. Will share the compose file once I'm near my desk.
1
Don't get him a cat.
He's already got one in the Linux terminal, and it can read files too
1
Don't give him anything related to his job/hobby, you'll get it wrong, if he wants something he'll get it himself.
Get him a nice massage (or new socks).
1
Don't overthink it... Just roll your sleeves up and start working on things. Nothing is a substitute for experience.
1
Don't think they offer a time-series database
1
Don't understand the downvotes here. Good on you for dealing well with constructive criticism! 👌
1
Don’t sweat it, but if I were you I would try to go at least 4-6 months without calling in sick or being late. Your manager will forget eventually.
Depending on your manager, you could also disclose that you have bad insomnia. I let my manager know, and to my surprise he told me not to worry about coming in a little late as long as it’s not too frequent.
1
Doom slayer rubber duck! Can someone let my wife know too. Haha
1
Drawbacks? ...or challenges?
1
Drone.
1
Dude, why not Ansible? Like... it is far from being a new technology, and it makes all that config and admin work a one-time command... It works flawlessly wherever you have SSH access, so it's just stupid doing it manually
1
Dude…I’ve been at it for years and no one has found me out hahaha
Just keep at it…if you’re the only one, likely there is no one even available to “find you out”
Just keep chugging along…unless you hate the role, then I’d look around for a better fit.
You got it either way!!
1
The easiest way might be to create a service out of your script, say with systemd or whatever your distro uses. That way you can upload the file changes and restart it (and it will run as a daemon and not in the foreground, which is your problem now).
That will also take care of keeping it running if it crashes. Google how to do that for your distro; it should be easy enough.
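For reference, a minimal unit file for this approach might look like the following. All names and paths here are made up; adjust for your script and distro.

```ini
# /etc/systemd/system/myscript.service  (hypothetical name and path)
[Unit]
Description=My long-running script
After=network.target

[Service]
ExecStart=/usr/local/bin/myscript.sh
# Restart the daemon automatically if it ever crashes.
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

After dropping the file in place, `systemctl daemon-reload` followed by `systemctl enable --now myscript` starts it as a daemon and keeps it coming back after crashes.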
1
Easy now, OP said he is done with the 900 course so he's not clueless. My suggestion is to open a free account and find someone to show you the ropes inside the portal. Screen sharing live prod will be a no-no.
1
ECS and Nomad are great alternatives to k8s. All three are different in their own way. Learn all three and see which works best for your needs.
1
Edited a routing table. Prod broke because cloud and on-prem couldn’t talk. Unedited routing table, restarted appropriate apps, all was fixed.
Pushed a readme to a branch. Everyone freaking out about changes to a repo. Thought I had accidentally’d changes that oops’d the branch. Admitted to the push. Turned out it wasn’t me at all. Nobody cared. Good times.
1
ELI5: a lock doesn't protect you from theft, but it adds inconvenience for inexperienced thieves. A private subnet does the same.
1
employee number who the hell knows of 300,000 here...been an outsourcing resource for 30+ years, I prefer direct deposit so just pay me
1
English, bash, python
1
Environment within a corporate environment. Can initiate inbound connections.
1
Err, there really isn't any "way" around NAT, NAT is NAT.
Do you mean like opening ports to the outside world with DNAT?
The big thing for me is the logical separation; even though the idea of a real DMZ is long gone, it makes things less error-prone. And it makes errors less disastrous when they occur.
It's also just plain best practice, and it will help someone in the future understand your setup.
1
ES 1.7? That must have been like 8 years ago or more?
Do you have a comparison to their current offering?
1
ES clusters are a beast to maintain. Been there, not exactly to that scale, but been there.
1
Even better.
1
Even if the IaC is written in C and you don't know C, you can extrapolate, read the state, and start writing IaC with the language you know.
If you only know HCL and not a single programming language, then that's a huge problem, and it's not related to Pulumi but to you. The idea of a DevOps guy who does not know a single programming language still baffles me.
1
Even the free tier of GitLab might be enough
1
Every rule has exceptions :)
My point was really about not expecting that all one needs is a boot camp and now they’re a devops engineer that can work in any org. There is a wide range of technologies and tools and that’s not going to be mastered in 2 weeks.
If a company is hiring people out of their own boot camp then that sounds more like entry level training / internship process. I’d assume that it’s a way for them to weed out candidates and determine who is trainable for their work. I also assume that this company is a consultancy type org, sending out their employees to multiple customer sites. Or possibly working as part of a large contract.
Experience is experience so run with it. But also don’t stop learning. Don’t wait on the company to train you on how to do dev ops work. Keep learning outside of the bounds of work.
I have interviewed many people whose resumes said they had 8+ years of devops experience, worked on AWS, wrote terraform, etc. When talking to them, they barely understood networking concepts, struggled to read and write code, and although they could move around in the AWS Console, couldn’t talk to why you would use any of the services. My assumption is that they were told what to do at every step of the way (e.g. run books) and didn’t understand how to apply the concepts on their own.
IME DevOps engineers have to know a good deal about everything. And what they don’t know they need to pick up quickly.
1
Everyone knows how hard it is to estimate software projects. When you use words like "simple tasks are taking me ages", there is an implied software estimate there. It's an estimate which you know from experience has a low likelihood of holding.
Projects get done when they get done. Go easy on yourself. Take breaks, ask for help, don't give up. Your velocity will improve.
1
Everything needed for DevOps, SRE, and SysAdmin was already taught at uni when I was there. That was 20 years ago.
Algorithms, collaboration, abstract problem solving, presentations, …, cultural differences
I did CompSci/Software Engineering.
What do you feel is missing?
1
Everything should be in git. Including everything needed to recreate all your pipelines and infrastructure (e.g. terraform).
1
Everywhere I apply to and that hires wants Terraform so that's what I focus on
1
Exactly what I preach. The most crucial thing is to be able to learn things fast and understand concepts easily.
1
Exactly, it probably is systemd. OP see man systemd-resolved for more info.
1
Exactly, it’s an index, not a durable datastore.
1
Exactly. So if I can introduce as much lower-barrier-to-entry stuff as I can that gets us more modernized and automated, I'll gladly pick Terraform with HCL over a CDK or Pulumi 100% of the time.
1
Except its future is questionable with the GitHub acquisition.
1
Except not everyone writes the apps/code. Our company uses Python, Node, JavaScript, Java, PHP, and .NET Core to design websites and mobile apps. My team doesn't write a single line of code that contributes to the app unless it's glue, so to speak, binding two systems together that normally don't work with each other. We would need to know all those languages and whatever new languages the developers decide to use.
For us it makes the most sense to standardize on something like HCL that the whole team can learn together, as opposed to one person really liking Python vs Go vs PowerShell/C#/.NET.
It would be a nightmare to troubleshoot IaC if everyone used a different language.
1
Executed rm -rf * on a prod server hosting the MQ queue manager of a so-called "platinum client". Very luckily it was in a home directory and I immediately recovered almost every file from other servers.
Also, to those who use PuTTY on a regular basis: I recommend disabling right-click paste.
1
External traffic, Lambda CPU time and log volume can also pile up quite a bit..
1
Failing is definitely the one thing I'm good at; it's how I've gotten so far in my career. A wise man once told me: if you don't feel like you're in over your head just a little, you're not learning, and if you don't have to lean on your team for a little bit of help, you're not working.
So I document my failures proudly with frequent Branch commits and jira updates.
1
Fair dos, not used it so can't say anything on that, but hey, if it's working now, job's a good 'un 🙂
1
Fair enough, terraform doesn’t have a steep maintenance curve. Most of the work is already done for you.
The Spotify example, however, illustrates that terraform is not without risks. IIRC they came up with a clever couple of protection mechanisms around branch mgmt, but it’s clearly easy to nuke yourself.
Starting with terraform is the right idea, I agree with that. It’s distinct from the others; it’s a lot harder to scavenge for IaC after the fact, unlike e.g. de-monolithizing or containerizing.
1
Fair enough. Feels like a lot of redundant work that a manager should've already done.
1
Fair, but then the dev team needs access to whatever tools interact with the DB; there shouldn't be a separate ops team
1
Feels terrible and shameful, I know, but seriously don’t sweat it too much. Unless your manager sucks or it becomes a regular pattern, it really shouldn’t matter much.
1
Finance to DevOps is an insane jump, but good luck.
Generally I only describe relevant experience on my resume but I still list all work experience without bullets to demonstrate work history. Maybe go back only like 10 years if it’s getting too lengthy.
1
Find a copy of “Working effectively with legacy code” by Michael Feathers. It’s an older text but discusses this idea of seams in code. You don’t know what a library does, so you use LD_PRELOAD or $CLASSPATH hacks to load a replacement/stub library you wrote. Hopefully you have unit tests. If not, start by writing tests to validate your understanding of existing functions/methods.
1
Find the bottleneck. If it hurts do it more often. It's not about the tools. What is the thing that everyone hates doing and automate the hell out of it until it goes away, then move onto the next thing.
1
Finding capable engineers. The average dev can barely wrap their head around the most basic git concepts.
1
First I will be starting with a lab machine that I'll put Proxmox on. I've just set myself the task to make a Playbook to create a VM template image that I can then clone. Next I will create a Playbook to interface with Proxmox to clone and create any VM with the required specs. And then, another Playbook to perform the first time setup and hardening of the VM so it's ready for deployment. By the end of it I should be able to run 2 Playbooks to get a VM.
This already puts me in a position where I cut out all the following manual work:
- Manually clone a VM in the Proxmox panel
- Manually set the network settings, hardware changes
- Manually boot it up, and configure the network by connecting to the console
- Manually SSH in to test it's working
- Manually perform first time setup (updates, packages, hardening etc)
- Manually resize the storage devices to match the provided storage
This feels like a good starting point because then I can create any number of VMs in the lab and pretend they are physical hosts or cloud instances, and learn more about Ansible like setting up monitoring as you and others suggested.
I'll also create a single "master" VM which will act as my Ansible control node and will have the tools needed to do that. I may even create a Playbook to provision the VM as "ansible-lab-control", which will clone the repo I will create for all the learning tasks I'll be doing with it.
I am excited.
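As a rough illustration of what that first-time setup / hardening playbook could look like (the group name, package list, and hardening steps below are just examples I've picked, not a prescription):

```yaml
# Minimal sketch of a first-time-setup playbook for freshly cloned VMs.
# Inventory group, packages, and the SSH tweak are all illustrative.
- name: First-time setup of a freshly cloned VM
  hosts: new_vms
  become: true
  tasks:
    - name: Update all packages
      ansible.builtin.apt:
        upgrade: dist
        update_cache: true

    - name: Install baseline tooling
      ansible.builtin.apt:
        name: [vim, htop, fail2ban]
        state: present

    - name: Disable password auth for SSH (basic hardening)
      ansible.builtin.lineinfile:
        path: /etc/ssh/sshd_config
        regexp: '^#?PasswordAuthentication'
        line: 'PasswordAuthentication no'
      notify: restart sshd

  handlers:
    - name: restart sshd
      ansible.builtin.service:
        name: sshd   # service name varies by distro (ssh on Debian/Ubuntu)
        state: restarted
```

Running it would be a single `ansible-playbook -i inventory setup.yml` from the control node, which fits the two-playbook workflow described above.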
1
First one: I've wiped a client's configs. It only took 10 hours to restore manually, by looking through tickets and whatnot to piece things together. Owned up to it to management, which led to improvements being made in some of our processes. Months later, said server crashed and it looks like I forgot another config, so the data drive didn't mount. Easy enough to fix, but it took a while to realize why it broke like that. The client never knew what happened.
Same job, different client. Client asked to halt and pull a server. I halted their other server. I immediately notified client and apologized. This guy was known to be very nasty, but he was surprisingly nice about it and just said please don't do it again.
1
First... I'm wondering if you could explain what you mean by using docker compose to bridge the script and the application?
Aside from that, here's my super StackOverflow answer:
Use [caddy](https://caddyserver.com/).
Their [image](https://hub.docker.com/_/caddy) is dead simple to use, the config is fine, and auto-https is baked in. You can use it as a sidecar, proxy for multiple images, or just as a base to add your app into, depending on the complexity of your existing config.
[caddy-docker-proxy](https://github.com/lucaslorentz/caddy-docker-proxy) is a pretty stinking popular module also, takes a bit to grok it but it also has an [image](https://hub.docker.com/r/lucaslorentz/caddy-docker-proxy) you can play with.
> The plugin scans Docker metadata, looking for labels indicating that the service or container should be served by Caddy.
> Then, it generates an in-memory Caddyfile with site entries and proxies pointing to each Docker service by their DNS name or container IP.
> Every time a Docker object changes, the plugin updates the Caddyfile and triggers Caddy to gracefully reload, with zero-downtime.
Alternatively use base containers or host the script somewhere and pull it centrally as others have suggested, however I'm not sure that solves your use case.
1
Flip the perspective :) As a software engineer I can write my app, my pipeline and my infra (if I even need it) all in one lang. This greatly reduces my need for other teams and IT4IT. Especially if you design your software architecture well.
1
Flow control in a document?
1
Flyway uses a lock mechanism to make sure only one instance is able to access the migrations table.
Other apps may fail to start due to the lock on the table, but if they have a correct restart mechanism, they should simply restart against the already-updated database.
If it worked differently, the whole migration tool would be useless.
1
Focus on your salary source first. If you understand what I mean ;-)
Let me know if you want me to elaborate
1
Folks hate certifications but somehow all of those X Architects have quite a few of those.
1
Fook is a DA? Distract Attorney?
1
For a DDoS attack, I’d probably use Cloudflare. They have a free tier you can use to get started. You need something in front of your dynos to protect them. Express wouldn’t be able to protect itself. Imagine Express is the cook at a restaurant. Using express-rate-limit is like having the cook check to make sure each customer only ordered one item. But it still sucks if there were 1000 protestors trying to mess you up and the 2 customers who want to order are lost. You need a bouncer in front (Cloudflare) to let legitimate customers in.
express-rate-limit is more for rate limiting your legitimate users. You don’t want them hammering your server.
1
For a devops role with security emphasis especially - there isn’t a world in which 2 years of university qualifies you. Financial analysis qualifies you in the sense of working with data / programming languages.
You’re taking this to an even higher extreme. Not just devops, but devops with developed security skills. Even as a fast learner - it would take 10 years of experience to do this truly effectively.
Stop feeding people such unrealistic expectations. You might just be ridiculously talented or a borderline genius - but this doesn’t happen beyond getting very lucky.
1
For an entire environment including browsers etc., what about VDI? Something like AWS WorkSpaces? Access from everywhere, leave apps running, and no need to install.
Personally I would look at VS Code and Remote Containers, which work well with Windows WSL. Stick to the browser etc. in the host OS and use the browser's syncing feature.
1
For anyone to help, you are going to need to be more specific -
What cloud?
What kind of tags?
1
for d in $(ls -d /path/to/srv/www/var/html/app/php/build-* | sort -V | head -n -10); do echo "$d"; rm -r "$d"; done
Something like that removes the oldest directories from a dir while leaving the 10 most recent.
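A variant of the same idea that avoids parsing `ls` output, which breaks on unusual filenames. The `demo/` layout is made up for illustration, and `head -n -10` requires GNU coreutils:

```shell
#!/bin/sh
# Keep the 10 newest build-* dirs, delete the rest.
set -eu

# Illustrative layout: 15 versioned build directories.
mkdir -p demo
for i in $(seq 1 15); do mkdir -p "demo/build-$i"; done

# Version-sort the dirs, drop the newest 10 from the kill list,
# and remove everything that remains (the oldest 5 here).
printf '%s\n' demo/build-* | sort -V | head -n -10 | while read -r d; do
  rm -r "$d"
done
```

`printf '%s\n'` on the glob emits one path per line without invoking `ls`, so the loop is safe as long as the directory names themselves contain no newlines.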
1
For function, sounds about right.
For security, it needs some love. If it’s an API, it needs a gateway and auth. If it’s public, it needs a WAF.
1
For Kubernetes, it's way better for everyone if we have one industry standard. It's not the only way to do it, but you don't want to have to learn 50 different container tools.
1
For most people, screen sharing will be impossible due to NDAs. Also, with all due respect, you wouldn't understand shit, lol.
It would seem like the Matrix to you (at least this is what my normal friends say).
If you want to code, just start coding, become a developer (I suggest Python or Golang), learn basic Linux (bash, etc.).
Just write tic-tac-toe, a calendar app, a weather app, battleships, cinema seat reservations, etc.
Be sure to make this public (GitHub).
It doesn't matter if your code is shit, you are just starting, it's ok. Most senior devs produce shit code anyway.
What matters most is that you want this.
Cheers!
1
For SDET they learned Git, SQL, some web stuff, Java, TestNG, Jenkins, and some other stuff as well. I don't know the extent of their knowledge of these topics but it got them the spot.
1
For Terraform, I've often been asked which structure and approach I would implement when trying to match X or Y use case.
You build infrastructure differently depending on what kind of project it will be:
- something with static amount of envs
- something with dynamic amount of envs
- something with variations in environments
- which additional tool could help with that
Take-home tasks for refactoring to make infra scale better as the company grows. Make it more maintainable.
For any cloud, it's good to check whether the candidate (cloud engineer) would be able to set up a whole account from almost zero (as most companies deliver landing zones).
So networking, on-prem connection, deployments, exposing services, what to watch out for so you don't pay too much for no reason. Depends a lot on the project.
Don't ask shit that is not related to the work someone will be doing if it's not needed in the near future.
1
For the ephemeral environments do you copy the whole system or do you have some kind of profiles to create? Do you use some tooling / platform for this?
1
For the frontend, this is built into AWS Amplify (and other tools like Netlify). For the backend, we are still on route, but I expect the following:
- AWS CDK to manage the infrastructure and spin it up
- Have pre-warmed environments that can be rotated in so that spinning up is faster (I haven’t tried this approach in practice yet), or
- spin up only systems that changed and connect other services from a default testing environment
1
For the reasons I listed they look very comparable.
1
For the record I agree with you, and I meant to reply to the same comment you did.
1
For whatever its worth, if you like your current work environment I'd bias towards staying.
Personally, I'd treat the offer's stock-based salary as imaginary. You may not choose to stay in the company long enough to realize the equity, and you may not be able to offload the equity when you really need the cash someday. You say you don't want to leave the cash behind, but the cash doesn't exist for some years.
For me, working for a company who is paying a real cash salary about $40k higher (plus some indeterminate part of equity), whose environment you know you like, and whose mission you believe in - that sounds like a better deal to me.
But I obviously don't know your life situation - this is the best 2 cents I can offer from a Reddit post :)
1
Former Rogers employee here, I didn't work on any of their core infrastructure but other products that touched related billing, identity, entitlement, payment and fraud systems.
Things may have changed as it's been 5+ years since I left, but they were very waterfall, i.e. rolled out changes in giant batches. We would have 100's of people on a conference bridge at 3am going down an implementation runbook with 1000's of steps, many done manually. There was every possible platform running, from extended-support IBM mainframes (read: obsolete), Windows NT servers, old Nortel-branded gear, even some Novell changes, to the latest and greatest serverless + k8s multi/hybrid cloud deployments.
When shit hit the fan mid deployment, it would be chaos. People talking over each other, yelling at others, people breaking out in different languages. A giant shitshow in general.
Dev/test environments don't exist for everything due to the tech stack sprawl, and some shit they don't even know still exists could be running. I kid you not, we were looking for storage to dump old decom'd physical servers and we came upon a giant room full of HP stackables and data acquisition units, complete with what looked like a manual switchboard still powered on. Based on the layers of dust and cobwebs, nobody had touched that room in a long time.
1
From a work perspective, it depends on your company / management. Ideally, you should be able to be honest that you've got some sleep issues, you're trying to deal with them, and you need the flexibility in your work to manage them appropriately. If you have shitty management then you need to come up with a lie they can't call you out on.
On a personal level, find a better doctor. You're having major sleep issues which are affecting your ability to function in society, that deserves proper treatment. Whether it's medication that allows you to sleep in the same pattern as the rest of society, or a doctors note which states your need for flexible working hours to accommodate a medical issue, you deserve proper treatment rather than being pushed aside.
1
From my experience with different teams, what worked best was to be use-case driven, i.e. try to deploy an app in an automated way, expose it in a secure way, load test it (to check HPA and cluster-autoscaler), get comfortable with debugging tools, etc. If you have time, look at other K8s distributions (OKD and TCE, for instance) to broaden your view.
1
From my experience, ClickHouse is the best database for storing time series data other than numeric values (for example, events and logs). It provides outstanding query performance, which scales linearly with the number of CPU cores available to ClickHouse. It also scales to multiple nodes. While TimescaleDB and InfluxDB can also be used for storing non-numeric time series data, both of these databases require at least 10x more time for executing heavy queries over time series data. See [these benchmark results](https://clickhouse.com/benchmark/dbms/). Additionally, these systems usually require much more disk space than ClickHouse for storing the same amounts of time series data.
As for the best time series database for numeric data, then this is definitely VictoriaMetrics. It is based on ClickHouse ideas, but specifically optimized for numeric time series. See [this article](https://medium.com/@valyala/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282) for details.
1
From my experiments with bash to do unit tests, I'd recommend not doing that. Unit tests are great, but if the environment doesn't support them, they get in the way more than they help.
What I settled for is to use Node.js (I just like it, Python does the job equally well) which calls shell commands as needed. But I can encapsulate those shell commands in a (for me) sensible and capable programming environment which supports tests better. That alone makes testing so much better.
That said, I use tests and Node.js on a regular basis, so I might have simply a preference to use something I know well.
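The same pattern works in Python: keep the shell invocation behind one thin wrapper and put all the surrounding logic in plain functions that a test runner can exercise directly. A minimal sketch (`run` and `kernel_name` are illustrative names, not from any particular library):

```python
import subprocess

def run(cmd):
    """Run a command list; return stripped stdout, raise CalledProcessError on failure."""
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout.strip()

def kernel_name():
    # the shell call is a single line; the logic around it is ordinary Python,
    # which a normal test framework can hit without any shell mocking gymnastics
    return run(["uname", "-s"])

def test_run_echoes():
    assert run(["echo", "hello"]) == "hello"
```

The point is the boundary: one function owns "talk to the shell", everything else is testable code.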
1
From my understanding of rate-limiting, I thought it would be important in terms of DDOS attacks. I've just dug into it a bit deeper and it appears that I can implement this via Express with **express-rate-limit** so it's likely not the issue I thought it was.
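For illustration, the mechanism behind middleware like express-rate-limit is typically a per-client counter or token bucket. A rough Python sketch of the token-bucket idea (class and names are invented for the example; express-rate-limit itself is Express middleware, not this code):

```python
import time

class TokenBucket:
    """Allow at most `capacity` requests per `per` seconds for each client key."""
    def __init__(self, capacity, per):
        self.capacity = capacity
        self.rate = capacity / per      # tokens refilled per second
        self.buckets = {}               # key -> (tokens_remaining, last_timestamp)

    def allow(self, key, now=None):
        now = time.monotonic() if now is None else now
        tokens, last = self.buckets.get(key, (self.capacity, now))
        # refill proportionally to elapsed time, capped at capacity
        tokens = min(self.capacity, tokens + (now - last) * self.rate)
        if tokens < 1:
            self.buckets[key] = (tokens, now)
            return False                # reject: client exceeded the limit
        self.buckets[key] = (tokens - 1, now)
        return True
```

Note this only slows down well-behaved-ish clients hitting one process; a distributed DDoS still needs mitigation upstream (CDN, WAF, or the load balancer).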
1
From my understanding, GTM is just fancy DNS. I think this guide from Nginx describes essentially a replacement for F5s FTM using open source components (specifically, Nginx Plus, but I imagine you can configure community Nginx using the config files the same way, just not with the fancy API and stuff): [https://www.nginx.com/resources/glossary/global-server-load-balancing/](https://www.nginx.com/resources/glossary/global-server-load-balancing/) Super good question, though. I'd be interested to learn what others bring to the table.
1
From your answer, I’m wondering if your architecture is right. I obviously don’t have all the information in hand, but using the nginx ingress as a job manager doesn’t sound right. Have you considered adding a layer for that purpose?
Like nginx ingress -> api server -> job manager.
By job manager I mean something like hangfire which is dotnet only but I’m pretty sure you can find one that fits with your stack.
1
From your comment it would seem that you’ve not had to program with terraform to deal with slightly complex situations.
1
Fuck yeah.
1
Fuck. Man nothing has ever given me more goddamn anxiety than managing an artifactory environment.
1
Fuckin great, another one.
1
Fucking using Big Data tools for datasets that fit in RAM on my laptop.
No Steve, we don't need a Hadoop cluster for your analytics workload that accumulates 100MB of data per day.
1
Functioning pull requests and the web IDE. It's annoying to pull a whole repo just to make a one-line change in a repo I will never be in again, just because it had a Unicode character. The merge request flow is really buggy: if someone leaves the org and they're still an approver, your request will poo itself.
1
Generally, I use Python, Yaml and Bash, but mostly python to generate and parse my yaml since I get annoyed at different yaml settings and parsing rules, especially for larger projects. Plus I generally wrap most of my scripts in a docker container to share my work so team mates can use it without having to really deal with dependencies.
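A sketch of that generate-and-parse approach with PyYAML (assumes `pyyaml` is installed; the deployment dict is just example data):

```python
import yaml  # PyYAML

deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "demo-app"},
    "spec": {"replicas": 3},
}

# generate YAML from plain dicts -- no hand-managed quoting or indentation
text = yaml.safe_dump(deployment, sort_keys=False)

# parse it back and work with real data structures instead of string munging
parsed = yaml.safe_load(text)
```

Letting the library own serialization sidesteps most of the "different YAML settings and parsing rules" pain, since you never edit the indentation by hand.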
1
Get him 3-4 hard drives and a note saying "buy your own Synology"... My friends did this for my birthday one year, greatest half-gift ever. If he doesn't already have a home NAS, that is.
You can also look for craft-y classes he might like; for me, my love of devops comes from a love of building. Leather, coffee, and jewelry-making classes all fit that bill.
1
Getting early stock, also getting 160 with a 10% bonus puts you around 176 TC (plus whatever your stocks turn into, and that's really going to be the difference maker)
Are you working on a product that will become a billion dollar unicorn? Do you want a piece of that? In 5 years that could be $10,000,000+ and an early retirement, or the freedom to work for whoever, whenever, and wherever you want.
If the added compensation is honestly life changing now, I'd say sure, take the new job. But honestly you sound like you're in a good place, especially with the full remote and the reasonable compensation package.
You can always make more money, you can never make more time.
Realistically what do you think your current RSU's will be worth in 4 years? Is that going to make up the difference you'd get while waiting on the RSU's from the new gig to vest?
1
gh client:
https://github.com/cli/cli
1
GHA is good it just seems to go down every few months.
1
Github
1
GitHub actions and argo is my favorite combo so far. At least if you're doing kubernetes.
1
GitHub actions are not intended for services and they will always time out or you’ll get a surprise bill and/or your account locked for abuse
1
GitHub actions let you supply your own runner containers, which would be cheaper vs full saas.
1
Github Actions?
If you're already using Gitlab, the transition into Github Actions should be "easy" or at least, look quite familiar.
1
Github Org level secrets are a blessing with actions.
If you're asking how secure it is though, you're already down a bad path in most circumstances.
You should be more concerned with what the secret has access to, and how it's being used. Lock that shit down as best you can. Github is one-way with secrets, and they'll attempt to censor the logs from printing anything out, but it's probably not perfect.
1
GitLab is lightyears better than Bitbucket. GitLab is well worth the money, if you can make the financial argument
1
Gitpod period. There are quite a few other solutions like that and all you need is a terminal.
1
Give it some more time. Try to grasp one thing at a time. Make sure your manager understands your needs.
1
Give Levitate a spin https://last9.io/levitate/hosted-prometheus
1
Go study https://roadmap.sh/devops and create a path for yourself. Learn git.
1
God, this reply is amazing.
1
Going with k8s is a good choice if you're mostly using cloud providers for deployment.
But don't try and do it all in one big bang. Set yourself a target of migrating one service (a web service would be maybe easiest).
Do the terraform for the k8s cluster, and the k8s manifests for the deployment/service/ingress.
Sometimes it's best if the people doing the web app code are also responsible for containerising it. But either way someone should learn how to package up a docker image and make some pipelines to get an image built and pushed...
1
Golang <-> Performance + Kubernetes
Python <-> General Purpose / API
Bash <-> Scripting Pipeline Steps / Sys Work
Java/TS/JS/Kotlin/Groovy/C/C++/C# <-> Depends on what the Developers use to write apps
1
Golang, bash and some python. Then the languages of the apps of course. Typescript, C#, Java, SQL and what not.
1
golang, jsonnet, terraform
none of which do I have much skill with yet...
1
Golang, python, bash mostly. Rust if you fancy
1
Good bot. But don't you dare...
1
Good call, thanks - I did set the nginx controller to 1 pod but I didn’t for the application. I did confirm that data is only being successfully read by one instance but I’ll give that a try.
1
Good companies will ask for your GitHub portfolio and open source contributions -- you could have had Kubernetes production experience and it might have been terrible and not up to date with today's standards. Now all orgs are using service meshes, GitOps, and security (DevSecOps) -- it's a fallacy that experience from 8 years ago is relevant today. Back then DevOps was running a Jenkins CI/CD job to deploy to a server, spending hours figuring out why it failed, with no easy way to roll back.
1
Good luck on getting that free custom written tutorial on how to host a timesheet application from S3.
If you want to showcase some skills, then acquire some skills.
1
Good points! The problem is that the ROI on investing in people/processes/infrastructure to prevent such a major outage and its repercussions can only be measured (and, from this standpoint, justified) after the outage, not as a preventative measure, and by then it's too late...
I am wondering why telecom companies have such a monopoly, and whether there is a way to overcome this and let smaller players (maybe some new innovative tech companies) rise. As an added fruit, less monopoly might decrease cell plan costs: phone plans in Europe are a fraction of the cost compared to here!
1
Good to know! Thanks for finding and using TimescaleDB!
1
google
1
google dude... devops tool #1
1
Google is optimizing for machine cycles; unless you're super niche, your startup is optimizing for human cycles. Different goals need different paths.
1
Got any benchmarks?
1
Got the 24Gb model
1
Grafana Mimir hits your requirements (cost-wise, too). I work for Grafana and I’m happy to discuss further, just let me know!
1
Grafana Mimir might be a good option. It scales to at least 10x your requirements.
1
Great advice. Also, Ansible lets you settle into the mindset of: I want this done, I don't care how you achieve it, do it. What I'm saying is a little vague, but I hope you get the idea: the manifest is pretty much how things are done now.
1
Great advice. And funny you say that as I just spun up a dev/test subscription. That’s a great idea. I’m gonna try to get a bicep or terraform file for existing servers to spin in up in dev environment and be able to test changes on. Thanks again.
1
Great feedback, thank you. I agree there is a ton of content out there.
Mentorship is great, but we already have a shortage of senior engineers. Hence why I'm trying to educate people remotely :-p
So I don't know. I'm still very much conceptualizing everything. I see a lot of botched implementations of stuff at IT shops because people don't take the time to zoom out one or two levels to see how stuff is put together. That systems thinking approach is missing. I can teach that.
My education was actually in sustainable energy engineering, but I happened into IT. So I guess I was at some point one of those people who just needed a decent paycheck, too. I brought with me a deep understanding of system and scientific data analysis, though, and was perplexed at how un-scientific workplaces ended up being.
I guess one part "culture" and one part "science" is what I'm trying to get toward. I hope that armed with those tools, people will want to learn more.
1
Great, now who is willing to teach a team of ops people who can barely script and aren't willing to put in the time to learn?
A DSL lowers that barrier, puts everyone on the same page, and makes it easier for one team member to help another, with plenty of examples of how to accomplish something because everyone is working off the same language. For example, I was using WinAppDriver to write some UI tests. It supports .NET, Python, and Java. The environment required that I write my tests in .NET Core; I was a contractor and that's what they wanted. I'd never done something like this before and I'm not a programmer. I'm an ops person who can read code, get an idea of what's going on, and write scripts that use programming concepts (devops) in PowerShell and Python.
When I went looking for examples of how to do things in .NET Core, 90% of the content I found was the super basic "hello world" in a perfect environment. The actually useful stuff was all in a different language. So for someone who isn't a developer writing unit tests, it was an overly frustrating experience due to the lack of information in the language I had to use.
Another advantage of a DSL is that it's written cleanly. If I look at an Ansible playbook, it's instantly easy to see what it's doing, unless you create really complex playbooks. Code can be neat too, but it's harder to ensure it stays clean and that everyone isn't trying to reinvent the wheel.
I have one direct report who, for some reason, decided within the same script to send email via PowerShell and then separately via the .NET library to let the user know about their password reset. The idea of in-memory operations is foreign. No really, they use CSV as their memory space: they take a list of users, create a CSV, import that, modify it, output it to another CSV, then re-import that CSV, which is the same information as the first, just flipped. I've tried explaining so many times that it's just adding a bunch of code, with examples of how they can accomplish the same thing in memory much more easily, but they can't wrap their heads around it.
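For contrast, that whole "export, modify, re-import" dance collapses to one in-memory pass. A hypothetical sketch (the data and names are invented for the example), writing a CSV only once at the end, and only if some other system actually needs the file:

```python
import csv
import io

# sample data that would otherwise be round-tripped through CSV files
users = [
    {"name": "alice", "status": "expired"},
    {"name": "bob", "status": "active"},
]

# filter and transform entirely in memory -- no intermediate files
to_reset = [u["name"] for u in users if u["status"] == "expired"]

# emit a CSV exactly once, at the end, if a file is genuinely required
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["name"])
writer.writeheader()
writer.writerows({"name": n} for n in to_reset)
```

Same result, a fraction of the code, and nothing to clean up off disk.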
1
Great post, thank you! Question: Doesn’t Solution v2 result in every invocation being a cold start? If so, is that actually better than a retry-able periodic failure at scale?
1
Great reply. I especially like the multi-region redundancy idea. Writing the code to support that is definitely a good idea, even if I don’t actively deploy that way at first
1
Great stuff man. I recently started getting into ansible and kubernetes and wanted something like this. Still quite a newbie, but I have probably an easy question.
I was wondering if there is an easy way to install this on a ubuntu vm I have. Instead of using localhost:port I would like to use ip:port.
Thanks for the help and creating some cool stuff.
1
Great suggestion with valuable insight. At least we can begin here
1
Great work, just went through it and would love to start using it.
Since you're making use of an API to pull the cost of resources, how can we make sure our data is secured and none of our information is compromised?
1
grow up dude, I'm just asking for an opinion!
1
Grow up. Read the rules.
1
Ha! I've been an Oracle DBA for 15 years and never knew this one!
1
Had two lines of code to change in a UI. But they made the feature dependent on a telephony system, and the UI was a monolith, so I had to have the data passed down as a child. It took me a day or two to get the code fixed in the monolith so the data was passed down correctly, and some time to learn that the telephony data was changing and that I couldn't test this anywhere but prod. I told everyone on my team and the business that this would likely cease to work when the telephony system changed over. I tried to comment this in the code, but the external PR team had a no-comment rule.
It broke. It took days for them to figure out where, because they didn't understand the layer it was breaking at, but we had told them the first day what it was. Some people didn't listen…
My team was cool about it because I was very vocal about this being an issue, maybe obnoxiously so. Some people in my department were not.
1
haha
1
Haha Spotify wasn’t on the startup side of this equation, but the FAANG. Yea there’s no S in FAANG but you can be assured they’ve got a massive engineering operation. I’m saying if _Spotify_ has trouble with getting the basics right, you can expect it will take you significantly longer as a startup with very limited bandwidth.
1
Haha you’ll be okay. Have fun and fail fast; communication is incredibly important
1
Haha. I have websockets working properly with my Flask app. But the library I've just started using is PyWebIO, and it doesn't support websockets through Flask (that I know of). So I'm forced to use Tornado. It's not serving a primary function in my app, so I'm fine with the implementation.
And I just implemented a token authorization, so that's sorted as well.
1
hahaha, fear not, upvote/subscribe to [https://github.com/infracost/infracost/issues/1814](https://github.com/infracost/infracost/issues/1814) for updates on IntelliJ
1
Happy to answer questions, I’m kinda proud of it. I’m in the process of cleaning up the code and culling any secrets that found their way in, and I hope to open source some parts of it eventually.
I started with Ubuntu server and docker compose. As I got more complex, I migrated to virtualization with KVM on Ubuntu server. Finally I landed on Proxmox.
I believe all those iterations were the right choice. I developed my docker stacks on raw Ubuntu with ansible, migrated to KVM when it became relevant to utilize ephemeral systems, and then migrated to Proxmox when I started automating things that didn’t fit well with Ansible. Finally migrated to the swarm when I got enough hardware to run it
My point is that it’s all up to what you’re trying to do. I wasn’t trying to build what I have now at first, I was trying to run some things at home. It only became what it is today because of my curiosity, willingness to learn, and an honest sense of satisfaction when I get something working for myself. I’d never touched ansible before this. I had never seen a terraform resource. I thought Proxmox was a cat walking on the keyboard. It’s amazing what’s possible with relatively little money but it has taken A LOT of work. If the current state had been my original goal, I’d have burned out in a week.
1
HAproxy works as well
1
Hard disagree with the k8s part. Managed k8s is pretty great regardless of your scale. It does a great job hosting just a few apps as well.
1
Harry Potter for IT managers.
1
HashiCorp has a site dedicated to learning their products https://learn.hashicorp.com
1
Have done that one before. Had a SVN repository mounted in a virtual machine (well, chroot). Went to remove the chroot files now I didn't need them but didn't unmount the repo first. That "-r" part of "rm -rf" sure does love to recurse.
Managed to recover the files, but not necessarily all the history!
1
Have to ask – which one did you not yet use at scale?
1
Have you looked at garden.io?
1
Have you checked out CircleCI?
They integrate with GitHub, and GitLab support is in preview.
1
Have you considered hosted IDE solutions like
* [gitpod](https://www.gitpod.io/)
* [codespaces](https://github.com/features/codespaces)
* [coder](https://coder.com/)
1
Have you done this manually yet with token auth?
1
Have you ever heard of "Infrastructure as Code"? I assume we're are talking about YAML files or similar kind of configuration. Why do you think that this is not code?
https://en.wikipedia.org/wiki/Infrastructure_as_code
1
Have you forgotten how to ride a bicycle after you got your driver's license?
1
Have you heard of group think?
1
Have you looked at AWX and putting that into a runner container?
1
Have you tried [Kubernetes](https://kubernetes.io/)?
Unless, I misunderstand your requirement, I think you are looking for a tool to manage and maintain multiple containers. correct?
1
Have you tried Portainer or OpenShift? You might find this [blog post](https://www.portainer.io/blog/portainer-vs-rancher-vs-openshift) useful. It talks about the differences between Rancher/Portainer/OpenShift as a Kubernetes management platform.
1
Have you tried to use CDK (or pulumi for that matter) with Go?
Some languages are better for that case than others. I really like Go, but I don’t want to write Pulumi/CDK in Go. That’s a pain.
1
Have you tried using the Docker Desktop?
1
Having a bootstrap system that sets up the minimum required for the rest of the system to start is perfectly normal. After the final pivot happens, getting rid of that system is a must because of its privileged status within the system. A good example, when provisioning Kubernetes clusters, is running a local kind cluster to use the Cluster API to provision a proper HA cluster, then pivoting the local system over to the provisioned one.
1
Having worked at many places ranging from start-ups to Fortune 100's over 20+ years, I've found that environment (people) matters more than money.
Also, the money looks better with your current employer anyway. Based on what you posted, you will be at $192k in 2 years without having to worry about the company's performance (stock options at the other place).
1
Having worked at multiple places specifically on the components that run tens to hundreds of thousands of kubernetes nodes, I can assure you - I am fully aware of how complicated it is. This is literally what I do. You are certainly entitled to your opinion but I don't think it matches with the actual reality of doing the thing given my personal experience. I fully support you joining any massive infrastructure team and absolutely proving me wrong, hell, I would gladly use your tool and test process if you can do it and keep the support issues low enough to make it actually viable.
1
HCL - IaC (80% of my work)
YAML - Pipelines/SOPS/Docker (10% of my work)
Bash - Scripts (7% of my work)
Python - Custom Lambdas (3% of my work)
1
HCL, bash and powershell primarily.
But are they programming languages? It's more scripting and configuration. I code in whatever programming language fits best with the tooling and dev teams I'm integrating with.
1
HCP Vault's pricing is [here](https://cloud.hashicorp.com/products/vault/pricing). It's their managed platform in the cloud.
But you can self host for free and manage it yourself.
1
He edited his response, once he realized he was wrong.
1
He LOVES building things. Maybe a woodworking class
1
Hear me out.
A good mouse/keyboard goes a long way if he does not already have one.
1
Heh. I remember getting to a point where I wasn’t afraid to burn it all down, it is very liberating.
Of course for me, it was just my homelab, I had 3 servers I was making a proxmox cluster out of, AND trying to learn Ansible. I’d make some big mistake and then just wipe them all and start again. Ansible made it super easy to bring them all back.
1
Hello, you've been doing nothing but spamming it for the past few days; no one gives a shit.
https://old.reddit.com/user/Ganja2233/
1
Here man, I've got acronyms for your acronyms.
1
Here's a use case for you:
You're a supporting ops team. You are in a shift-left phase, so you're working with developers who write APIs, showing them how to do basic debugging within Kubernetes and Terraform. Rather than force every dev to install the tools across different macOS, Windows, WSL, and Linux environments and manage that nightmare (plus the pushback from devs about installing strange tools), you develop a Docker container.
Now you can train every dev on the entire provisioning life cycle and CD pipeline so they are better equipped to service their own software, and to design software that works better in these environments.
Now the docker container can be shared company wide to each dev team, and can be in a repo that can be updated for security issues, and tools added and removed as needed by the developers.
1
Hey /u/DJ_GRAZIZZLE, thanks for your feedback. Could you let me know which parts of the post you found cringeworthy or disagreed with?
1
Hey brother, not sure of your living situation, but you can't beat fully remote these days. There will always be more money in our field, but we'll never get our own time back, which being remote provides (e.g. no commute traffic).
1
Hey, I'm just finishing our migration from EC2 instances running a monolith to EKS hosting microservices.
Maybe when we switch production over, I'll do a linked in post too!
Kubernetes was definitely fun.
1
Hey thanks for the post! Been really having a burn with AZ104.
1
Hey, I’m that another person who banged my head against AWS China service discrepancies. It’s different enough to make your life miserable - there is ALWAYS something different enough to have a need for an independent version of infrastructure code to be deployed.
1
Hey, just wanted to say you guys saved my ass once. I walked into a company with custom "rollup scripts" in PL/pgSQL that performed half-assed segmentation and sharding, and always failed.
Migrating to TimescaleDB reduced my toil by about 50%.
1
Hey!
Thank you for your input and your experience!
Yes, I'd say I'm sure that I don't enjoy development, since I've realized even from personal projects that it was more torture than something I enjoy.
Although if it's a simple script to automate some commands on Linux, or a very silly Python script, that's okay, but that's more than enough. So the rest of your work doesn't sound bad, and maybe those are things I can become good at!
Haha, yes, say that again about the cultural spectrum of people in a company and in a team.
I mean, nothing wrong with the nerds, and in many ways you could say I'm a nerd in some things, but jesus, some of these guys in my team!
I feel like we are from different planets! The worst are the super morning persons lol. They sometimes go to sleep at 10 in the evening. (I'm coming from a country in Europe where staying up late is more normal, I think lol)
1
Hey. I have not read it, but I've had a lot of colleagues that recommended it to me. It always sparks interesting discussions.
If you want change in your company, you need to convince management, and most of the time upper management, to force middle management...
If you are not there, anything you read is a guideline for questions to ask during interviews at your next company, to ensure they do things as close to DevOps as possible.
1
Hi All
I created a series of PowerPoint, Keynote and Google Slides templates for the most common use cases (over 40 different categories, like startup pitch, business process, scrum, education, etc.). The goal was to use ready-made formats to convey ideas in meetings rather than spending time designing the presentation.
I believe this would be very helpful for consultants, teachers, entrepreneurs, students, or anyone who has to prepare presentations in their day-to-day work.
You can download the templates here:
https://kumarvivek.gumroad.com/l/presentation-full-access
Additional discount for Reddit users using code: reddit
I kept the price as low as possible (actually $3 with the above code) to keep it as affordable as possible, but I had to charge a little to make it worth the time.
If you can't afford it or you are a student, comment below and I will send you a free copy.
Hope you like it, and if you have any questions, comment below.
1
Hi, I'm working on a project on GitHub that covers everything DevOps. It is an incident reporting platform that exposes `days_without_incidents` for Prometheus. It is made with a Django+DRF API and a React front-end client, with production running in Kubernetes alongside its own PostgreSQL database pod. The project board is filled with ideas and work to do. Here are the relevant links.
- [Project Board](https://github.com/users/chazapp/projects/1)
- [API](https://github.com/chazapp/incidents-api)
- [Front End](https://github.com/chazapp/incidents-front)
- [Kubernetes](https://github.com/chazapp/incidents-k8s)
Feel free to study, fork, and even submit pull requests to it. This should give you an idea of how a DevOps guy develops, deploys and maintains projects in production.
1
Hi, just wondering what were the reasons/benefits of moving away from azure app services to k8s? I was just at MS Reactor last week (based in UK) and their engineers were suggesting that their serverless products - Azure Container Apps - were the way forward (further abstraction on top of k8s)
1
Historically, my biggest challenge has been getting the organization to formulate a clear plan for the team or business unit such that candidates can judge whether they will be stimulated by the work they will actually be asked to do, as opposed to the work that others in the organization are already doing (and which, by extension, new recruits will not do). E.g. "We're on the bleeding edge of ML." seems usually to mean "You'll be cleaning the data for the actual ML guys."
1
Hmm
They provide a feedback?
Even spamers?
I did not know
1
Hmm maybe a pig instead
1
Hmm, maybe it is a bit too much, true that, although it's a good way to approach a stack. You could get rid of the load balancing and the Redis bit, maybe.
1
Hmm, maybe you can include these tools as well - Sentry, Sematext, Atatus :-P
1
holy shit are you me
1
Honestly I say go for it! Hands on experience with people looking over what you do is a great experience to learn things! There are other resources on ways to get into DevOps but those are more "teach yourself" routes.
I have a buddy that did a bootcamp for networking and it treated him well - he is now a senior sysadmin with no degree making about 130k a year! It really just comes down to how well you retain knowledge and being able to prove you know what you're talking about. ;)
1
Honestly, CircleCI is one of the worst vendors I've ever had the misfortune of working with. Their costs are obtuse by design so you spend more than you expect. Their support is basically 0/community unless you pay 10k a year for a premium support plan. The product itself is actually not half bad when it's not either down for maintenance or just plain broken. There hasn't been a single week this year where I haven't had issues with builds failing due to their platform. Half of the time their status page doesn't even show issues. I'm hoping to switch off their awful platform soon as I have some free time on the roadmap :D
1
Honestly, getting good applicants is the first challenge, and then convincing the right people why your recommendation is the best candidate even though they don't have as many qualifications listed on their CV.
1
Honestly, I've been thinking a lot about vertical scaling and runtime optimization. Modern CPUs are so powerful now that you can probably handle massive amounts of traffic if your application is written in a statically-compiled low-ish-level language like nim or rust. Maybe even run postgres on the same machine and communicate over UNIX sockets. Then just replicate the db and have a warm backup ready to go in case the main server goes down? You can probably go a long way with a tech stack like that...
1
Honestly? It's an absolute must have. You may not need it for a long time, granted, but understanding why you don't need it and when you might need it is exactly what most teams get wrong and exactly where disaster looms.
I like to always compare software tools with hand and power tools. If you're a wood worker: maybe you don't need a table saw, or a jointer, but you must know what they are and what situations they might be helpful or overkill. You must understand the pros and cons of investing money, time and space in these tools and be able to correctly assert the pros and cons of this investment.
We can even go simpler: if your work requires you to use nails and screws, you must know how to use a hammer and a screwdriver, and understand when each one is appropriate or not.
1
Honesty is typically the best policy. In my experience, I would have an open and honest conversation with your manager. If they are experienced enough, what they'll do is get you some help or give you some time to get up to speed on kubernetes. Listen, I've been doing this for 20 years and even I struggled with that platform. Best of luck. Keep trying and you'll do great.
1
Hopefully you can find somebody freelance and/or who’s willing to let you shadow on personal projects, but in general, anybody who actually works for a company isn’t going to be allowed to let you shadow them like that. Your best bet would be to look at the tons of resources that have been posted here, tutorials, YouTube videos, etc.
1
Hoping he is good ... Or it will be challenging
1
Hot take: It’s not an unreasonable algorithm: People who update LinkedIn are more likely to be looking
1
How about CKS? You'd learn to secure Kubernetes, but also some primitives re container security, auditing, etc..
1
How about doing the impossible and being a little more kind in your comments -> mindblown
1
How about installing a helm-chart without any resources, but with a post-install hook job that runs kubectl from within a pod in the cluster?
1
How about using an external and bootable SSD, that has Linux installed and all your projects. That way you can boot it up on pretty much any computer.
1
How are you using CUE? I have not seen it much out in the wild.
1
How can those teams pull commits from Bitbucket if it's in a private datacenter? If it has no public endpoint, there would be no way to contact it from AWS :(
1
How did that work out -- if you don't mind me asking?
1
How do you provision that Kubernetes cluster? Let me guess....terraform.
1
How does that fit into the "master is always in a deployable state" mantra?
1
How long you got? Off the top of my head…
1) Tried to save time during a manual Linux deployment by SCP-ing a file across multiple servers. I thought the file was the same on all 4; turns out it wasn't. As soon as the copy ran, 3 of the servers tanked on the graphs and the remaining one started screaming under the traffic. Thank god for backing the originals up beforehand.
2) pushed a change to our CICD pipeline code that broke all releases just as the US developers were coming online.
3) shut down the wrong DB instance due to having too many sessions open on various machines. Thank god automatic failover worked as designed but I caused the DBA to spill his coffee as he was making database changes at the time and thought he’d broken it. Bad timing on my part.
I would always recommend immediately owning your mistakes, broadcast them to your team and grab as much help as you need to get it fixed. Be open and responsive to feedback and criticism, and be proactive in fixing documentation and processes where needed.
It also helps if you have a track record of fixing stuff when it goes wrong when it wasn’t your fault before you start breaking stuff, goodwill goes a long way!
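The lesson from (1) above can be reduced to a tiny pre-flight check, sketched here in Python (hypothetical helper; in practice you'd hash the remote copies over SSH rather than local paths):

```python
# Before overwriting a file on N servers, confirm the existing copies
# really are identical by comparing content hashes, not just filenames.
import hashlib

def sha256_of(path: str) -> str:
    """Stream a file through SHA-256 so large files don't eat memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def all_identical(paths: list[str]) -> bool:
    """True only if every file has exactly the same contents."""
    return len({sha256_of(p) for p in paths}) == 1
```

If `all_identical` comes back False, stop and investigate before running the mass copy.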
1
How many DSLs would you like to learn/remember during your career?
1
How much RAM / how many vCPUs do you give the VMs, and does Docker need a lot of RAM?
Pretty sure those specs you mentioned are pretty cheap-ish
1
How much better were the offers you got after the CKA? You can DM me.
1
How much of it reads like an appendix sans the actual book.
class Logger
--> .format(event) => Object
The end. Much helpful.
1
How so?
1
How to check that, tho? Every company I checked on Glassdoor or GoWork had terrible reviews (all outsourcing).
I 100% agree with you. This should be a next step. If we have more than X directories delete the oldest one.
1
I actually did an interview recently where they asked about recommending moving a monolithic application to Kubernetes. I told them that was not a great idea. Applications need to be written for containers, and quite a lot of the time containers just add unnecessary complexity. So it's not always a great idea to arbitrarily move to a containerized system.
I know the interviewers didn't like that answer. But that's ok with me. I don't want to work on that sort of project, especially since they didn't have a specific need for this solution and it's just over-engineering something that doesn't need it.
1
I actually did this - [https://www.sandervanvugt.com/rhel-8-learning-resources/](https://www.sandervanvugt.com/rhel-8-learning-resources/) - in 2019. You just need the basics. This course is more focused on becoming a Linux admin.
1
I actually found a 3D printing pen I thought looked cool and would be a good starter to get into it. He has an entire work shed with all his fun things and tools so I thought that could be a cool addition?
1
I actually just finished reading it this week. I tried reading the devops handbook first, but I find going through a fictional narrative got me more invested in the process rather than just jumping straight into the theory. I personally am still very junior in my organisation, so I can only really push ideas and hope they get taken up. One large takeaway I found is that I needed to better understand what each team values and make suggestions for projects that would add to this value. If you back things up with data, then it’s difficult to argue against logic.
1
I agree and I honestly am often on the fence about when a separate repo is warranted. I found that for build scripts I like to have a separate repo so I can update the scripts and manage the versions without hitting any build triggers. On the other hand I do like the idea of having a "package" organized in one place, and you could just exclude certain updates from triggering the process ...
Either way source controlling and versioning of all this stuff is absolutely critical in our space and if that is not being done should be the highest on any DevOps engineer/lead/whoever's list.
1
I agree: test-lab it, and once you have prototypes in place you can roll it out. But... you could just learn what the best practices are and write up RFPs for a contractor to come in and implement things for you. Then you learn that structure and maintain it.
This is really a CIO decision point since it will impact all of the systems. It's an IT cultural shift, since you code differently for microservices that will live in K8s vs standalone monoliths.
1
I very much agree that traditional DevOps (if one can even say that, because it is still a somewhat new field) is becoming less needed. I think this comes from the expectation for developers to “own” their own code from code to deployment. However, I think the DevOps engineer’s place is still much needed as the central glue. Also, because of our specialization we can spend time developing projects to improve the infrastructure as a whole.
1
I agree with this sentiment on idempotency. OP, think of your image as a deployable build artifact. Just like you'd download a binary of a tool to run on your machine, your image should be the form of your Python program to run on (I'm assuming, based on the terminology here) Azure, which is effectively your execution environment. If there are problems, you narrow down your issue to that specific combination of code+machine-image.
1
I am 20, live in chicago
No college degree
Bare Programming skills
Certified in AWS CCP and working on the Developer Associate
Gunning for a DevOps internship by the end of this year, at the very least would like an accountability partner if someone else reads my comment
1
I am a fresher and just started out at my current company.
The worst thing I have done yet is send my own picture, instead of the data I was supposed to send, to an on-premise SFTP server.
Learned the hard way not to use a personal computer to handle company work. Worst part: I cannot delete it. I just hope nobody sees it among the millions of files in there.
1
I am aware
1
I am currently doing CEH. It has [19 labs](https://ilabs.eccouncil.org/ethical-hacking-exercises/) included and the exam itself is just single-choice questions. Some of the labs are quite basic, but some, like session hijacking and wireless hacking, were pretty fun. Some of the lab walkthroughs are on YouTube. As a former firewall admin, then Microsoft sysadmin, and now Unix-environment DevOps engineer, I found it very platform-neutral and focused on practical examples; in every scenario it shows how to do all tasks both with Windows tools and with Parrot OS. Not everything was new, but I enjoyed the concise set of information on different areas and the set of tools that comes with the labs, some of which I didn't know existed.
1
I am deeply interested in a platform as a service that scales to zero. Open source developers could use it. Like Serverless but for servers
The problem is that containers take memory, and memory isn't cheap, and VMs take 10-30 seconds to come up.
If desktop and web apps and mobile apps queued operations for when the server comes back online you could have very cheap apps but also have the scalability if you need it.
I don't know how fast Kubernetes can start a container. But I think it's under a second
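The queue-while-offline idea above can be sketched as a small client-side buffer (all names here are hypothetical, not from any real library):

```python
# Sketch: the client buffers operations locally and flushes them once the
# scaled-to-zero backend has spun back up.
from collections import deque
from typing import Callable

class OfflineQueue:
    def __init__(self, send: Callable[[dict], bool]):
        self.send = send       # returns True if the server accepted the op
        self.pending = deque()

    def submit(self, op: dict) -> None:
        """Queue an operation and opportunistically try to deliver it."""
        self.pending.append(op)
        self.flush()

    def flush(self) -> int:
        """Deliver queued ops in order; stop at the first failure."""
        delivered = 0
        while self.pending:
            if not self.send(self.pending[0]):
                break          # server still cold; retry on the next flush
            self.pending.popleft()
            delivered += 1
        return delivered
```

You'd call `flush()` on a timer or on a reconnect event; ordering is preserved because nothing is dropped on failure.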
1
I am interested in scalability challenges. I want to build a reusable open-source infrastructure-in-a-box that gives you preconfigured advanced scalability without much effort. It needs to be cross-account, multi-region, multi-availability-zone. Mazzle can bring up Kubernetes, Consul, Vault, Packer and a repository server, plus a bastion, and it uploads packages, installs SSH keys, and includes Prometheus and Grafana. There are so many moving parts and it's incomplete. I need to add backups and so many other interesting things. It's not ready for other people, but the pattern works.
I wrote mazzle and platform-up and mazzle starter which is a precursor but it's not in a state for reuse yet.
http://GitHub.com/samsquire/mazzle
http://GitHub.com/samsquire/platform-up
http://GitHub.com/samsquire/mazzle-starter
1
I am saddened and slightly angry to read this. No sir, customers know exactly what they want but the problem has always been that IT people do not know how to extract that information from them. If you expect them to write down their requirements in your terminology without input from you then I am sorry to say you'll get it wrong.
Do not try to get them to use your terminology but rather ask them "what would you like to achieve", "how would you like to do it", and crucially "Why do you want to do it". i.e. understand things from the customer perspective. Then confirm your understanding via paper/whiteboard prototypes asking questions as you go until you both are clear about what they need to do.
1
I am starting a 2-year program - not a uni, more a bootcamp/projects type of thing - and DevOps is an option for the last half year. The CS degrees in my country are not great, so it doesn't compare to them either way.
1
I am vegan.
1
I appreciate it. I always try to approach projects and problems "code first" - so I don't currently have any infrastructure that won't be under Terraform control. But it looks like a nifty tool for future jobs where I do need to bring everything under TF control
1
I appreciate the input! I've just been stressing lately as I would love to do DevOps the more I look in to it, but it's really hard to tell where to start exactly. That's why I'd be considering the bootcamp, mostly for it's organized structure that I can bounce off on and keep adding to the knowledge.
1
I appreciate your feedback!
1
I asked a question on this subreddit and the PostgreSQL one about scaling Postgres storage space. Ideally you want a scalable storage backend. That's why I like how Neon uploads to S3. I thought I could build my own Ceph cluster and run Postgres on it with TimescaleDB for a database that scales in sheer size. Could use partitioning to speed up queries for old data. Someone from TimescaleDB commented and said they're working on uploading old data to S3. They support retention policies to delete old data -- I wouldn't want to delete events from the past.
I looked into rook the ceph operator for Kubernetes and the postgres operator and I think I prefer to run databases and stateful things outside of Kubernetes. Maybe I shall try it out but I feel I would understand the stack better if the database was outside Kubernetes.
Running postgres on NFS is not recommended.
1
I believe Terraform: up and running is popular
1
I believe the guidance I provided is better than yours.
The work is in the design. He doesn't have a working solution. He has a vague idea. It's like asking for a "facebook clone". All you need is some code, right? The missing pieces are too numerous to constructively comment on, really. He doesn't know what he doesn't know. How does anyone find out what that is? You give it a go.
Shall we place bets on when he will show us his time sheet demo app? The work is in the design and there is still a long way to go. You can tell because no one has designed it for him already and no one is about to. No problem to ask but IMO my answer is the correct one, the most constructive. I'm not just some schmuck throwing acronyms to a noob to make myself feel smart. I'm giving the correct answer.
1
I believe there is room for both if you have laid out your internal environment as a service for your devs properly and remember that Terraform is an infrastructure as code tool, not a configuration as code one. Keep your high level infrastructure for each environment as terraform. Carve out each environment so that appdevs can have the rights to deploy end to end their apps without interfering with other app team deployments. Let them use the cdktf or pulumi (or other) tooling for their deployments with appropriate pipeline as code being centrally managed.
Keep in mind that infrastructure as real code is still subject to the same ongoing maintenance that you likely already have to undertake with Terraform alone (never ending updates of providers for ever changing cloud APIs and such). In 5 years when you have moved on and the company shifts to some new language how supportable will your code be? How easy will it be to shift to something else if it is all baked into your code base?
Great question, I'm loving some of the responses :)
1
I broke container access to S3 for our prod service. Tests were failing but no monitor alerts. 2 hours of debugging and we found a race condition between two services referencing one another in Cloudformation. Apparently updating their stacks at the same time set the S3 value to "None" 😑. Not necessarily my fault but I had no reason to deploy them at the same time, other than being overly confident in our pipeline. We now have a check in the pipeline job for values set to "None" or empty.
1
I brought down production just yesterday by running a load testing script without realizing it called an API which soon hit daily quota :D
1
I built that with 3 years of experience. You are beating yourself up too much.
Some companies cannot take risks and look for proven experience.
You can lie, but it will backfire on you.
A good way to train is to set up your own home lab. I'm running a small smart-home setup - it helps a lot for testing new technology and solutions.
You have to show initiative to join some teams.
1
I came here to down vote anyone who said something other than bash. Good job.
1
I can assure you hosting your own jenkins will be far worse.
1
I can't let you see my screen for the reasons others mentioned but if you want to try out some unpaid internship with my mentoring on a project I'm working on + potentially working on some open source stuff - you can find me on Discord at https://devopscommunity.org to discuss more.
1
I can’t say comparatively how expensive it is. I’m in a large enterprise where the centrally managed GitHub+other tools charge back is 5 times the cost of using Azure DevOps.
We also lucked out in that we have a bunch of users with MSDN licenses, so they’re covered + they add self hosted licenses for us.
1
I caused 1.7 million dollar database outage because (my team) forgot to replace a RAID card battery.
I know it was 1.7 because it was a startup and our financials were…slim for a couple quarters (before our buy out…which almost didn’t happen)
1
I certainly will. I think hosting our own Jenkins instance will give me headaches.
1
I contacted my manager and apologized for not informing him on time. I said that I prefer not to specify the private details and that I should be back tomorrow.
What you wrote is very helpful, I see that I will need to tell them more about what happened even if it's not easy.
About the sleep diary, this is a good idea and it might be helpful. I will do that. I didn't think about this.
But I am not sure about going back to the doctor and if it will help. I don't really like to talk about my issues.
I have an idea on why it happened.
I need to be more careful at keeping the same sleep schedule and try to sleep enough but it is very easy to forget about it and realize too late.
1
I could use some help envisioning a modern AMI lifecycle in a regulated environment. Currently using chef, but would like to move to a more ephemeral strategy. I'm happy to type out more or have a conversation if this is in your purview.
1
I couldn’t handle the terrible writing, I tried to read Phoenix and Unicorn. The CEO of a startup I worked at was really into them. I quit that place after a week. The writing reminds me of what would pass for a plot/story in porn. It was soooo lame
1
I definitely advise getting something set up. Remember, you need to monitor the application in addition to the kube layer. Since it's Azure, I'm assuming you probably don't need to do the hardware layer :) I haven't used Dynatrace yet, but if they have a way to run queries that you have saved from previous incidents, and those can be scraped or run, then you can probably connect it to Prometheus, Icinga, Nagios, etc.
Good luck!
1
I deleted all employees in production (luckily we had backups). I also had some UPDATE SQL queries where I forgot to add a WHERE - I definitely changed the way I ran updates after that: writing the WHERE before the SET. (We were a small team that did development and prod support; as we grew we all lost prod access, as it should be.)
I also accidentally released an update to an installation of a desktop app- that when you uninstalled, deleted the entire local machine node of the registry- instead of just the application. Instant blue screen. Oops. Luckily it just affected a few users.
My favorite oops was I improved the processing of a sql query- that used to take 12 hours- I got it down to ten minutes. But the side effect of it processing so quickly was that it flooded our database replication (this was mid- late 2000’s where each of our 35 global offices had a local sql database to optimize connections). This flood brought all global updates to a crawl, which begun to affect integrity of everything else as updates queues up in other applications. We quickly realized what was wrong and I created a solution to “pump” updates- we still ran the query in ten minutes, but instead of updating a million records In One transaction, we broke it into groups of 50k, which was much quicker to process the transactions for.
Fun memories.
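The "pump" approach described above can be sketched with stdlib sqlite3 standing in for the real database (table and column names are made up): commit in small chunks so replication never has to swallow one million-row transaction.

```python
# Batch a huge UPDATE into chunks, committing after each one so any
# downstream consumer (replication, triggers) can keep pace.
import sqlite3

def batched_update(conn: sqlite3.Connection, batch_size: int = 50_000) -> int:
    """Mark unprocessed orders as processed, batch_size rows at a time."""
    total = 0
    while True:
        cur = conn.execute(
            """UPDATE orders SET processed = 1
               WHERE id IN (SELECT id FROM orders
                            WHERE processed = 0 LIMIT ?)""",
            (batch_size,),
        )
        conn.commit()              # one small transaction per chunk
        if cur.rowcount == 0:      # nothing left to update
            break
        total += cur.rowcount
    return total
```

The `WHERE ... IN (SELECT ... LIMIT ?)` pattern is one portable way to cap rows per pass when the engine doesn't allow `LIMIT` directly on an `UPDATE`.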
1
I deleted all the subversion repositories. Nearly 100. All of them.
15 years ago and I still remember.
It was a script to upgrade the repo format, which it did… then deleted the old one, AND the new one.
1
I deployed a Kafka pipeline and enabled it on one end, assuming it had been started on the other end. Data was streaming to nowhere for about 15 minutes. Monitoring blew up… my manager was like “why didn’t you turn on the pipeline” and I was like, the ticket didn’t say anything about it and I was never trained. We spent 4-5 hours pulling the data in from CSV files backed up to a bucket.
This was a startup and we didn’t have resources for a testing environment so changes often went straight into production. I was conditioned to be VERY careful after that and when I finally changed companies to one that had entire environments configured and ci/cd I was like “you mean I can just break everything and nobody cares?” Lol
1
I did not get to ask a ton of role-specific or technical questions because this first interview was with an HR rep for the company. They are a little smaller: 50 people and an overseas call center for their product (not the happiest about that, but I know nothing about what that could entail). Good note on making sure there are ample days off where I am not expected to honor some type of SLA during my time off unless absolutely needed. I am fine with being a one-man team; I have been doing it for ~7 years now at the same company. I have just hit a wall with the resources I am allowed to play with to enhance CI/CD and monitoring/centralized control where I am at.
1
I did that as a career for ~10 years.
You pay me enough it’s absolutely an option.
1
I did the following two which I found really good:
* Learn DevOps: The Complete Kubernetes Course
* Learn DevOps: Advanced Kubernetes Usage
1
I did the kill-all thing too, on an AIX box running the Lotus Domino server for the college. Fun times.
1
I didn't even know this was possible...
1
I didn't know there was a name for this! Literally what I have to do every day at my job lol
1
I didn't mention google at all. I said to do it. He felt he could do it with a few lines in Terraform, my suggestion is to try it and see what you get. The activity will reveal what he doesn't know, and that is, at this point, what he needs to know. He is a long way away from worrying about a firewall yet.
It isn't just about the code, though that is an important part of an app, yes. Is a beginner going to pull off user input forms and report displays on a static web app, you suppose? Can you? You still need to design the infrastructure. You need to know first how to host just a static page on S3. That's not "so simple" until you know how. Along the way you learn how to set up S3 and all the questions you must answer to do even just that. But he doesn't know even this part yet. It's not just a few lines of code, any more than development only requires a keyboard, mouse, and editor.
He's very new at this, you can tell. You need to get started with the basics.
1
I didn’t at all make that Tripwire connection. Thanks for that!
1
I didn’t have a developer brain until around 5000 hours. I wasn’t good until 10k. Ask anyone (and me) that hires new CS grads and they’ll tell you they can’t think like a good developer until 1-2years out. Worst devs I ever hired were post-docs with theses in data structures. Could do all the math and big-O notation but couldn’t make a website with a table and list of items.
Don’t sell yourself short. Experience trumps class every time.
1
I disagree with Arch. It will require you to learn too much up front and bog you down. I use Ubuntu Server, and now Proxmox as a hypervisor.
I’d suggest finding a simple project and focusing on implementing it with DevOps.
For example, a PiHole for home dns. Use GitLab CI to spin up a PiHole instance on whatever hardware you have available. Bonus points if you use Terraform, even more bonus points if you use a VM, and double score if it’s an ephemeral instance with persistent storage. That should give you enough problems to solve and things to learn to keep you busy. You’ll need to do a little bash and a little CICD to get it running, and then you’ll figure out what problems you have (stability? Your home network uses it for dns so you want it solid) and you’ll have to learn how to solve them. That problem solving is core to devops - the problem is slow turnaround time for features/fixes and such, the solution is devops. Using the tools you have to solve problems is the point, not automation or Linux. Focus your learning in that direction and you’ll be alright no matter HOW you choose to learn it.
1
I disagree. Terraform can become pretty unreadable due to HCL. And there are no unit tests available to save you from that.
1
I do have the same question. Some good guidance would help selling people on the process.
1
I do not have a deep understanding of any of the things you listed besides databases and programming. But I was thinking maybe learning k8s would help me fill in the gaps.
1
I do the same, but AWS picks up the commits without GitHub actions.
1
I don't always break things, but when I do it's either just on my machine or everywhere!
1
I don't even know why you'd put any of your cloud resources into a CMDB, especially FaaS.
The point of a CMDB is that it holds information about the resources you own so that you can effectively manage them, and any cloud provider will natively let you enumerate all information about all your resources whenever you like.
1
I don't know how cheap the bare metal you're buying is. But when you're worrying about operational costs, don't discount the fact that managing your own servers has a cost in your wages. If you can spin up compute on GCP or AWS or Azure that costs 30% more, but you end up having fewer outages, lower maintenance costs, etc., it can make sense.
I don't know your situation so can't answer that for you though.
1
I don't know if it's taught in class or recommended, but several universities are doing this through online platforms like Coursera, EdX, etc. UC Davis on Coursera seems okay, but I think they get crushed by the offerings of tech companies.
1
I don't know if you can do this on OPs gitlab, but you can setup a proxy on sonatype nexus or (I believe) artifactory and directly pull the image using your server tagged version.
This avoids the 3 step process of pulling down, tagging and pushing up an image.
1
I don't think it's a formal name. I made it up while writing the post based on what I found myself doing. I'm sure others have too, but I didn't see the phrase documented on Wikipedia or anything like that.
1
I don't think the API source code is to blame there; it's written in DRF, and Django does not leave much room for optimization on the developer's end, since everything is managed by the framework. I've removed the CPU limits on my containers and relaunched the test suite with much better results; I've written them up in a different comment if you want to have a look.
1
I don't think the current solution is too bad.. they provide the modules to automatically update their cmdb.. yes, it might slow down deployments, but they get the data they want and they maintain the code for it, so it's no additional work :D
1
I don't understand how you're amazed that doing this on AWS is somehow that much easier than on premise. It sounds like you need a web server, a nosql database and python. Each of these is simple install-and-configure on a Linux system.
Of course there are other challenges and drawbacks going that route, but I don't see a big difference in the inherent complexity.
1
I don't want to be more rude than I already am, and excluding the very real possibility of imposter syndrome... how would you execute a "strategy" without understanding the technical side...?
1
I don’t doubt that adding a few buzz words to your profile would attract the bees. :)
1
I don’t have even remotely close to that amount of data, but the number of times ES has shit the bed and lost data on me is crazy. Last week I had a connectivity issue between nodes for a few seconds, which apparently was enough time for ES to decide everything is fucked up and it better just delete everything. That was fun to recover from. Luckily I had learned from the time it did the same about two weeks ago, so I had a way to get most of the data back.
1
I don’t really know much about Azure Container Apps, but we were basically using App Services to directly deploy .NET Core artifacts, and the services then communicate with each other through APIs. The main advantages for us in moving to K8s:
1. Everything is more manageable in the code. We could manage all our service configs and resources through Terraform.
2. The latency for requests went down quite a bit.
3. App services still incur downtime unless you use staging slots. Staging slots can be automated through the pipeline but they’re still a pain in the ass to manage.
4. Kubectl gives WAY better feedback and metrics on instances.
5. It’s actually been cheaper for us.
1
I doubt that InfluxDB can handle such volumes.
1
I feel like the DevOps role is also shrinking because fullstack engineers are now doing DevOps work too.
As a senior .NET fullstack engineer myself, I pretty much do the end-to-end stack.
1
I feel like this is the most versatile stack of languages.
1
I feel you, 'cause that's where I am heading right now. 9 years of work experience, but the most recent is as a cloud tech team lead. Right now I am about to change my career to a hands-on DevOps role where I feel like I will not be able to accomplish the tasks. Maybe it's imposter syndrome, but I'm kinda afraid right now...
1
I find it scary that so many folks relate to the idea that DevOps just entails scripting. I work in SRE, and most of my jobs, apart from one, required coding every day. Most of those teams also had applications we maintained.
1
I forgot the WHERE statement on an UPDATE script against one of our prod databases, which updated over 50,000 rows of PHI. Luckily a backup had run less than an hour before, and I was able to restore it that same day with help from one of the senior devs. Scariest day of my professional life.
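A hedged sketch of a guard-rail for exactly this mistake, using stdlib sqlite3 (table and helper names are hypothetical): run the UPDATE inside a transaction and refuse to commit if it touched more rows than you expected.

```python
# Run an UPDATE in a transaction and sanity-check the affected row count
# before committing; a missing WHERE usually shows up as a huge rowcount.
import sqlite3

def safe_update(conn: sqlite3.Connection, sql: str, params: tuple,
                expected_max: int) -> int:
    cur = conn.execute(sql, params)
    if cur.rowcount > expected_max:
        conn.rollback()   # far more rows than intended: forgot the WHERE?
        raise RuntimeError(f"refusing to commit: {cur.rowcount} rows changed")
    conn.commit()
    return cur.rowcount
```

The same idea works on real databases with `BEGIN ... ROLLBACK/COMMIT`; some shops go further and make a dry-run `SELECT COUNT(*)` with the same WHERE clause mandatory first.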
1
I get that. Maybe I'm just stubborn, but I rather would still do it all myself bit by bit.
1
I get the whole selling it to the business thing, but once (if) you sell it how do you actually efficiently copy petabytes of data? Like technically. Maybe its not that big of a deal, I've never tried to duplicate that much data.
1
I get what you're saying and I disagree completely... If you need to teach people about tools the last thing you would want is to add more complexity to their setup, which a container will contribute to.
You should also not control the dev environment for developers... This is just creating unnecessary toil for your team and their team.
1
I got it. Well, maybe there exists some correlation, but I don't see it.
I've done more than 50 interviews last year and understood one thing: if you are not Google and don't have a month for interviews, all those questions show nothing except that the candidate is able to discuss them. One or two hours is not enough to test skills, and from your answer I gather that you are not testing them.
Unfortunately I still don't have a recipe for how to hire the right person. It is a lottery.
Maybe you could provide some interesting questions and explain what they show you.
1
I got the 24Gb model
1
I guess I can chime in because I was in a somewhat similar situation. So I graduated college hating programming. Like, I didn't understand shit back when I had to do C++ (and tbh I still Stack Overflow everything). Then I had an internship where I did a little Linux/bash and LAMP stack, and eventually started as a "systems engineer". At this job I went straight into Docker and k8s and Terraform.
Then as time went on I did some work creating brand new Jenkins pipelines, and guess what, I was writing Groovy, which is essentially Java. Then I got a chance to work on some Python and eventually wrote a Flask API and deployed it on EKS. I actually realized I sort of like developing, at least basic things with Python. So my point here is, maybe you haven't tried it yet on a project you enjoy, to know whether you truly like it or not?
That said, it does sound like you just don't enjoy it, and that's fine in all honesty. What I stated above was like 20% of what I worked on... the rest is writing a ton of Terraform code, k8s/helm files, and bash scripts. As for the culture aspect, well, that just happens. I've been in the industry for a little over 5 years now, and after 3 jobs I've met both nerdy and more laid-back people. I just try to connect with coworkers on the things we have in common at the end of the day. It was always interesting to see the spectrum of people in the same industry, from the straight-edge people, to those down to go out for drinks after work, all the way to the people down to smoke, haha. I'd say that's something you'll see as you get more experience.
1
I guess you set a default small inventory and override that as required.
1
I had almost the exact same experience, but now I'm almost a year into the startup. I think three things helped me out a lot:
1. Just start working on something. It's okay if you get half way into it and decide to throw it away because there's a better solution. I threw out a lot of work but you have to take it as a learning experience, you're the expert in the company and if that's what it takes to understand the nooks and crannies, then so be it.
2. Talk to the lead engineers A LOT. You'll realize in startups there's so much undocumented knowledge and you have to talk to the engineers. It's not annoying, it's doing your job.
3. Get into the habit of learning. When you get strategic you step away from the "cool new tech" that comes out, but when you're hands on again it's relevant and can be applied. More importantly, it makes you want to try new things and think about how things run and if you could improve them.
What you're feeling is normal, and you're a good employee for feeling like that.
Good luck, feel free to DM!
1
I had read the book and in sort of a surreal parallel universe application we hired a CIO who hired a crony who basically tried to effect similar kinds of “shift left” thinking into the org, at first with implementing ITIL - not a terrible idea because we had no change management at all before then - then with trying to “do” Devops with no specific plan or roadmap or vision for what that meant in the org. Like he’d point at me and say “Hey this is so and so and they’re going to be doing Devops here” which was news to me.
He was a really really terrible manager though. Incredibly awkward and draconian “leader” who seemed to relish tearing people down and just canning people who didn’t follow along with his program- which you kind of had to intuit and figure out yourself since it was never articulated plainly or openly to the two large teams he took over running.
And yea I still think I have mild PTSD being “managed” by this guy.
1
I had a similar situation.
1. It is just a job; nobody is perfect.
2. Calm down, it is the right direction: you are being paid to learn for 8 hours a day. You put yourself out of your comfort zone to grow.
3. You are not stealing money or anyone's lunchbox; I mean, you are doing your best, working and learning. Stress less, do as much as you can, and you will get comfortable soon.
4. Do not worry too much about what people think of you. Do not fear asking questions.
Your manager is also new; he is not in a position to make changes now. Use the available time to integrate yourself and calm down.
1
I had the same issue with Vault. I ended up initializing a "standalone" Vault with the integrated storage backend using some bash scripts; using that I can deploy my production setup, and at the end migrate the Vault state to the prod database and remove the standalone Vault.
I figure you could use the same approach for other dependencies.
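For the hand-off step, Vault does ship a built-in `vault operator migrate` command that copies state between storage backends while both are offline. A minimal migration config might look like this; the paths and the connection string are placeholders, not the commenter's actual setup:

```hcl
# migrate.hcl -- run with: vault operator migrate -config=migrate.hcl
# Source: the standalone Vault's integrated (raft) storage
storage_source "raft" {
  path = "/vault/standalone-data"
}

# Destination: the production PostgreSQL backend
storage_destination "postgresql" {
  connection_url = "postgres://vault:CHANGEME@prod-db:5432/vault"
}
```

Both Vault instances should be sealed or stopped while the migration runs, since it copies raw storage entries rather than going through the API.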
1
I had to do something like this for a customer. Basically, multiple times per day they would send the DBAs a list of queries they wanted to run, which could range from editing a single record like "Hey, we got a duplicate order and we need to drop this row" to queries like "Hey, we need you to run this crazy big query tonight after hours because it's going to run for 4 or 5 hours."
So what I built for them was a git repo where they would submit SQL statements to be run. Each statement was put into a folder along with a metadata file generated by a Bash script they would run. The metadata file specified which server, which database, what time (anytime, after hours, scheduled), and what type of query (data fix, report, application change).
They would need approvals from their team leader and a DBA. Once approved and merged, Drone would kick off a build that posted all the data into a MySQL database via a Go API with a React frontend. Each PR was submitted to the database as a job with a scheduled start time. Then I had a little worker process running on each database server that watched for jobs matching its server name with a scheduled time in the past. The worker process handled updating the job status (running, failed, successful) and storing the output as a BLOB in the DB.
Yes, I know this is overkill for this task but I built this tool to automate a number of tasks like requesting NFS shares, adding LUNs to servers, requesting VMs, rebooting servers, etc.
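The metadata file in a setup like this could be quite small. A hypothetical sketch covering the fields the comment mentions (the field names and values here are illustrative, not the actual format used):

```yaml
# metadata.yaml -- generated by the submission script, one per SQL statement
server: prod-db-03                      # which database server runs the job
database: orders
schedule: after-hours                   # anytime | after-hours | scheduled
scheduled_time: "2023-04-01T22:00:00Z"  # only used when schedule: scheduled
type: data-fix                          # data-fix | report | application-change
```

The PR review then doubles as the approval workflow: the team leader and a DBA sign off on both the SQL and this metadata before the merge triggers the pipeline.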
1
I had to start a homelab to really learn, or find side projects at work where I could use a new toy (read: tool/skill).
I’ve learned a bit of CloudFormation and such thru work side projects, but I’ve learned more with my homelab than anything else. I’m running a single old repurposed desktop as a virtualization server, and on that I’m running a 3 node docker swarm cluster, a smaller vm that basically handles the GitLab runner, and an LXC container for Pi.Hole. The swarm runs ~10 stacks with about 30-40 total tasks.
If that sounds like a lot, well it really is. So I needed to automate it. So I learned ansible and terraform. I got tired of running everything local and wanted to eliminate that impediment, so I learned Gitlab CI.
My point, albeit buried in there, is that I can't learn new skills for the sake of "professional development", but I'll (hyperbolically) become an expert in a week if the tool will help with some project I'm doing for myself. So my advice is to start hosting some stuff on an old computer and running it at home. IaC, networking, devops, it's all relevant in a lab environment, but it's all personal, so you have free rein to explore, try it and break it, and play around to see if you prefer Ansible, Chef, Terraform, or whatever else, and then explore as deeply or broadly as you like.
1
I hardly touch bash. But I do use a tonne of yaml, JSON and python
1
I hate pictures above or below.
I build a table of 2 columns.
Text description, and directions on the left, screenshot of action on the right.
1
I hate the idea of pulumi credits and how it basically is a middleman for all your infra. I’m already depending everything on terraform, you want me to use another thing that’s in charge of everything? No thanks.
I don’t wanna be reliant on one tool for my entire structure and I want terraform alternatives, but Pulumi is not a clean alternative and people need to stop promoting that it is.
1
I have 3 relevant aws certs and have been interning at a cloud consulting firm and basically learned skills directly relevant to devops like Terraform, Boto3, Linux, Python, etc.
I don't know if that means much, but hopefully I can show my knowledge through personal projects.
1
I have a feeling if I can get myself up to speed on Ansible and Docker, and the whole concept of containerisation, CI/CD, and all the Kubernetes pre-requisites, I would be able to tell then if Kubernetes is really needed or worth the trouble.
1
I have a feeling you will find the majority of answers in this chart, so there's that.
1
I have a shell script:
create database servers (pg, mysql, mongo) & do a few SQL dumps
But most of the apps do a db seed (create the db, data, and if needed some Kafka topics)
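The "create database servers" step of such a script could equally be a Compose file, which makes the dev dependencies declarative. A hedged sketch assuming Docker is available (image tags and credentials are placeholders):

```yaml
# docker-compose.yaml -- throwaway dev database servers; run: docker compose up -d
services:
  pg:
    image: postgres:15
    environment:
      POSTGRES_PASSWORD: devpassword
    ports: ["5432:5432"]
  mysql:
    image: mysql:8
    environment:
      MYSQL_ROOT_PASSWORD: devpassword
    ports: ["3306:3306"]
  mongo:
    image: mongo:6
    ports: ["27017:27017"]
```

The SQL dumps and seed data can then be layered on top, for example by mounting init scripts into each image's entrypoint directory.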
1
I have always dreamed of putting this together - It would be my DevOps dream come true, but no luck so far :(
1
I have been through a lot of them:
Git VCS:
Bitbucket is fine for source control; IMHO some of the tools are less than ideal.
GitHub is in a league of its own. It does everything, and it does it really well. Documentation, usability, and functionality are all there.
GitLab is a lot like GitHub; it can do a lot of the same things, but not as well.
Azure DevOps: it's like it wants to be Jira, Bitbucket, GitHub, Jenkins, GitLab, etc., all at the same time, with a hint of vendor lock-in, missing fundamentals, and a horrible user experience.
Cicd:
GitHub actions are awesome, you can pretty much trigger pipelines on anything that happens within your repository.
Jenkins: just don't do it.
Concourse CI: that's awesome too; you can lock versions and prevent new versions from being deployed downstream, everything runs in containers, and it has a nice UI.
GoCD: also nice, but missing pipeline-as-code; it can be done, but it's not ideal.
My pick would be GitHub.
1
I have been using SOPS for encrypting git repositories. It offers multiple providers so you can use it with AWS secret manager for IAM permissions or with PGP keys; you choose.
Very good solution for gitops tools.
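For reference, SOPS picks its key provider per file via a `.sops.yaml` at the repo root (its AWS integration goes through KMS keys and IAM). A sketch mixing a KMS rule for prod secrets with a PGP fallback; the ARN and fingerprint are placeholders:

```yaml
# .sops.yaml -- provider selection rules, matched top to bottom
creation_rules:
  - path_regex: secrets/prod/.*\.yaml$
    kms: arn:aws:kms:eu-west-1:111122223333:key/example-key-id
  - path_regex: .*
    pgp: FBC7B9E2A4F9289AC0C1D4843D16CEE4A27381B4
```

With that in place, `sops -e -i secrets/prod/app.yaml` encrypts in place using whichever rule matches first, which is what makes it pleasant for GitOps repos.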
1
I have been working in the cloud, primarily AWS for a while now. I use lots of SaaS tools too. Containers, serverless, K8s and ECS. I do "all the Devops things".
I primarily use Go, Python and bash for code that I build and maintain in addition to whatever language(s) the core/application developers and data science people use. Right now, that's Ruby and Typescript but I've done a bit with Java, vanilla Javascript and C++ too. I have worked in places where some/all of the backend and analytics code was Go or Python too.
This sums up what I've used over the past 6-7 years but I've used a bunch of other languages as well.
I would love to really learn Rust but I don't have any solid use cases for it currently and it's a bit hard to learn. Given that I only have a finite amount of time, I probably won't learn Rust until I'm using it for work.
1
I have found that all of my managers have been reasonable when I’ve asked if there was a way I could devote 5-10% of my time to learning and growing and for me that translates to 1-2 sessions per week of a couple hours in the morning working on training topics.
Maybe set yourself up to do the “100 days of code” challenge and find some meaningful projects at your level to work on. If you do it for an entire 100 days - you will grow immensely.
Sounds like you know enough that you got the job and you work in technical environment so don’t forget to remind yourself of your accomplishments and what you’re good at if this is maybe a bout of imposter syndrome that’s set in.
1
I have K8s on my resume and I don't even know what CKA is lol. I wouldn't worry too much.
1
I have made an open source tool for this, feel free to check it out. [https://github.com/mbecker20/monitor](https://github.com/mbecker20/monitor)
1
I have not, but I have seen TF modules that seem to be compatible with AD DNS. I know you can at the very least do it with PowerShell.
1
I have not seen this in any of the three startups I've been in.
1
I have one customer that uses it and it is just a pain.
Code reviews and merge requests are just not as fluid as GitHub or GitLab. Things are often hard to track what actions need to be taken and even harder to validation that the action items have been completed.
Half the time the UI is taking up half the screen unless I’m on my 27” monitors.
We thankfully run our CI outside of Bitbucket, so I haven't had to deal with that.
1
I have over 10 years of experience (environments at a scale of 100s of servers). I'm far from having "seen it all" but I consider myself a senior DevOps/Cloud engineer. A year ago I joined a new company and the sheer number of new tools--new to me--was overwhelming. I've always had to learn new tools and systems, but this was something else. Even though the problem areas overlapped with my background, I felt like I was drowning. It took me a good 4 months to feel like I had my footing.
K8s itself is a beast and three weeks isn't *that* long. Keep going, maybe take a K8s course, and find ways to leverage your strengths. I don't know if this applies to you, but I'll mention it because I see it a lot. How well you're performing isn't *just* how long it took you to complete the task at hand. It's easy to fall into a mindset that you need to pick up the task and run as fast as you can. Take time before starting the task to fully understand the problem and its context. As you're working through the problem, take notes. When you're done, groom those notes into documentation, but keep your own notes for reference. This is a subtle but good way to bring value to the team and fight off imposter syndrome.
1
I have quite good experience with Jira as it also helps to visualize metrics quite well.
1
I have read a lot of comments on Kubernetes. Many seem to love it, but many also hate it. And it seems a lot of people are using it in situations where it might be overkill. From my perspective, I am attracted to the idea that it manages everything and decides which nodes to launch containers on, handles high availability and load balancing, and that is incredibly attractive if I can possibly attempt to reach that point.
I think it's a good idea to begin with Ansible too, and to replicate the current setup in pieces, making it easily re-deployable and restorable, and then work from there. Eventually, if I can upgrade it in chunks, there would be a point where moving to Kubernetes would seem far more feasible.
1
I have read it and started a bookclub in my organization to try to discuss the ideas and how they impact the organization I'm in.
I don't consider it to be a theoretical time waste at all. In my current organization, we have used it to understand the nature of work, identify where the work is coming from, and are taking steps to reduce not only unplanned work but to also streamline the intake process. We discovered recently that most engineers have far more work in progress than they can complete in a sprint so we started to refocus our efforts on certain kinds of work.
In essence we are working to reduce work in progress and improve the flow of work that is coming in on those processes. We are also starting to work on more metrics around the kind of work we are doing, the systems we are impacting, how long it takes for us to deploy, who the work is coming from, and a variety of other metrics that try to help drill down to where most of our time is being spent.
While the book isn't going to offer any strategies to fix the problems they describe, the important thing is to understand the three ways and manage the work.
1
I have read The Phoenix Project and The Unicorn Project, and it is essentially my life's mission to instil these practices in every company I work at.
I've been an engineer now for 14 years, so I have more experience than most, and I've become *very* good at forcing even huge companies into agile working practices, and it often looks like the brutal slog that you find in particularly the Phoenix Project.
I get fired a lot whilst I'm doing this, but I succeed more than I fail - I've driven the kind of change you see in those books in companies that were in an even more dismal state than Parts Unlimited, and you really have to have a mountain of experience and be the type of person who just refuses to be constrained by "the rules". You need to have people screaming in your face that you're going to get fired and be fine with it, because you know that adopting an agile, DevOps mentality is ultimately going to make everyone happier (and it always does).
Great books - but they depict an idealised and particularly easy-to-change organisation (which may sound odd as the organisation is written resistant to change). The approach in them is absolutely realistic, but boy do you need to have a hard nose to do it.
1
I have runners on both a Linux VPS and a Windows server. Though one thing I don't like about Drone is not being able to see runner status unless I check `docker ps` (or services.msc on Windows). (Their forums seem to see this as a feature rather than a bug, but I'd like to be able to see which machines' runners are pinging home to my drone-server.)
What you can't seem to do is cluster runners on the OSS version, but individual machines seem to be fine. That might be where the confusion is.
1
I have seen everything between you build it you run it to pure IT roles with tons of automation and cloud stuff.
At this point I would probably define it as a core skill set (container orchestration, ci/cd, iac etc.) and everything beyond that depends on how your employer defines DevOps.
1
I have the same problem, see an older post: https://www.reddit.com/r/cicd/comments/qmeybc/best_practices_for_running_macos_cicd_jobs/. Still don't have a solution deployed but I'm experimenting with Darling. For linux, I use libvirt + qemu + cloud-init for VMs and docker for containers.
1
I have this saved on an Amazon list for him!!
1
I have tried Terraform, Pulumi, and AWS CDK at the same time for 3 ongoing projects. It was a trial run for moving to IaC in my company and Terraform won. For the same amount of requirement complexity, Terraform code ends up being the simplest and easiest to maintain. A new dev could immediately learn it in a single day and be productive with it. Code changes are easily reviewed against the plan change which make merge/pull request a breeze.
I am aware HCL has some quirks, like the "if" condition that needs to be replaced by "count", the non-conditional lifecycle block, weird loop syntax, etc. But it still ends up more maintainable. The TypeScript code that we used for Pulumi and AWS CDK ended up looking weird because some devs can't wrap their heads around how the imperative nature of the programming language translates to the declarative nature of the infra, which I can empathize with.
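The "count as if" quirk looks like this in practice. A minimal sketch with made-up variable and resource names (and assuming a recent AWS provider):

```hcl
variable "create_nat_eip" {
  type    = bool
  default = false
}

# HCL has no resource-level "if"; the idiom is to map a boolean
# onto count = 0 or 1, so the resource either exists or doesn't.
resource "aws_eip" "nat" {
  count  = var.create_nat_eip ? 1 : 0
  domain = "vpc"
}
```

The downside is that every reference to the resource then needs indexing (`aws_eip.nat[0].id`), which is exactly the kind of awkwardness the comment is pointing at.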
1
I haven't. Mainly because we don't use Go. Language support is definitely better in some and worse in others. Can't disagree with that.
1
I haven't seen anyone else mention this, but what about tests? I don't want to be the party pooper who insists on this, but it belongs in professional software development. In addition, handling some edge cases and taking care of the details will take just as much time as everything else, so I wouldn't underestimate implementing and setting up the application.
1
I heard Azure is pretty expensive. But thank you for bringing it to my attention; I will have a look at it.
1
I honestly don't remember how. Probably messing around with terraform or (more than likely) the AWS console...
1
I honestly wish my team had done this for our early POC. Managing ECS + K8s cluster has been a nightmare for velocity.
1
I inherited a Jenkins server that used a ton of plugins and was an all-around mess. I have had pleasant experiences with Jenkins before, but those were with competent teams. This, however, was a damn nightmare. I migrated over to GitHub Actions and life is a bit more manageable. I would have preferred to get over to GitLab but it's not in the cards.
1
I just came across agile a few days ago and was looking for a simple explanation of what it could be. I came across a lot of technical videos and got more confused.
1
I just don't like the nginx documentation; at times it's too abstract, without an example result.
1
Right now I just feel more interested in DevOps, but I don't want to lose the experience I got working as a Data Analyst, as it's not very applicable to DevOps in general. Everything felt the same as your situation: basically, first I got interested in Data Science, then I got way more interested in Data Engineering (building pipelines, working with Big Data), but right after that I fell for DevOps.
1
I just looked this over. I was already a developer, so maybe not the basic language/API-use/etc. parts, and I didn't do any certs, but most of what you suggest is very, very close to how I did most of my AWS learning, including dogfooding my basic personal site.
That was 5+ years ago now. I still have a billion tabs open and go down rabbit holes all the time. Especially now; I pivoted to google for a newish job. I'm not expecting this to really change tho, ever.
1
I kicked off a restore of a 3mo old backup from Gerrit (source control system) including all repos and related data to a pre-production environment to run some tests... only it wasn't pre-production. About 2mins into the process I realize what I did. The system is hard-down and I'd wiped 3mos of work from 3000 developers. Thankfully, I had a nightly backup that was only 27mins old. I went through the logs and found I'd lost exactly one commit since the backup. I restored the 27min old backup and had the dev re-push the one commit. All was good.
1
I know k8s really well so personally I'd almost always want to use it as a deployment target.
1
I know most people feel that way, simply because people find it very surprising when I'm able to make these changes against massive corporate resistance.
All I can say is that the way your company "wants" to change is irrelevant - every company wants to change "top-down", and all companies actually innovate "bottom-up", against a bunch of resistance.
People who try to innovate "top-down" show their inexperience, and they show they've not understood the lessons of the book (particularly the Unicorn Project).
1
I know right?
It's pretty amazing it still works for a lotta people it seems.
Also weird most of the comments never mentioned groovy.
1
I know that, but I don't enjoy programming, and it takes me double the effort since I never liked an abstract, or let's say mathematical and analytical, way of thinking...
1
I know the question was mostly about tools, but it's difficult to recommend anything without knowing what kind of agile you will be doing.
The word "agile" has been buzzworded to hell and back and depending on the shop it may mean anything from fanatically following the Agile Manifesto to a more corporate 'waterfall with iterations and no requirements' version.
IMO the step 0 for you would be to talk about it with the business and/or whoever signs the paychecks to get buy-in for the process change from their side. Once that is done, step one would be picking the new work methodology that would work for your org, and from there picking the right software (technical solutions can't solve people problems).
If there are no specific requirements from management besides "make it better", I'd recommend starting out with [kanban](https://en.wikipedia.org/wiki/Kanban_\(development\)), as it is so simple that people have probably been doing it without knowing the name. Once this gains a bit of traction, you'll probably know what works in your particular org, what does not, and where to go from there. After a while it might turn out that a recurring meeting to check up on tasks in progress and demo completed tasks (read: effectively having iterations) would be a natural extension/improvement to the organization of work.
I'd also recommend checking out The Phoenix Project book and checking out /r/ExperiencedDevs - either one might be a source of inspiration on how to solve a particular problem in your org.
As for the software itself, the company i work at uses Rally. It works okay, but it would likely be an overkill for your use case - one of the simpler alternatives (jira, trello, or heck, even a shared OneNote page) would be a better fit.
1
I know.. but xdrive.. plus I loved it when I test drove it. And there was one ready
1
I like Harness for both CI/CD. For my personal projects, I use Drone.
1
I like packer for building AMIs, and you can write a GitOps-based pipeline that requires PR approvals before merging to implement specific controls. You can also lock down the build context using the IAM features of whatever cloud provider you are using. Are these useful ideas or am I barking up the wrong tree?
1
I like that attitude -- take a chance -- some of the best devs have come from non-traditional backgrounds like music, or are very persistent learners -- having holistic system design thinking is helpful.
As for the expectations of the hiring managers-- that can vary by a country mile
1
I like the article, but I wish I could see just one blog or technical post in the kubernetes space that wasn't an on-ramp to a product
1
I like the hackernews who wants to be hired/who's hiring posts. Lots of good talent there. There are also some sites that specialize in remote jobs to check out.
But honestly, recruiting in this environment has been very difficult. I think it's really important to internally evaluate what types of candidates you're looking to attract at the most detailed level possible (strong coding skills vs sysadmin skills for example). This way you can target candidates who would be a strong fit and at least try to minimize risk.
1
I like this concept; however, I don't enjoy development, so I'd prefer to find my place without being a developer/coder.
1
I like your initiative and also agree with u/davetherooster's feedback.
Let me know if I can help in any way.
1
I like your pragmatic viewpoint — that certainly makes sense.
Thanks for the perspective
1
I liked the Phoenix Project, but the sequel The Unicorn Project annoyed me with the fake dialogue.
1
I live in the UK and its fully government funded and no there's not a devops job role in my current company. I will be looking at job roles somewhere else and it will be entry level first. So what particular areas did you focus more on as you were transitioning as you mentioned you were also in IT Support.
1
I looked at the source code, and it looks to me like if one of the inputs fails during initialization, telegraf will quit. This can't be overridden, even though the MQTT input plugin does have the ability to reconnect.
I think you need to setup your system to automatically restart telegraf if it quits unexpectedly. I would also recommend, in case you have multiple inputs, to use a dedicated instance for mqtt so that the rest of your monitoring isn't affected.
1
I love how supporting everyone here is.
Don't give up mate, it's normal. Just take 2/3 hours day to study and you will be fine
1
I love the advice and support. And I believe you to be correct in your assessment.
However, I've finished Python, SQL, and the most extensive Excel certificates, as well as several projects I probably could've got belted for.
I also finished the number 1 Azure 900 certification course, so I think I'll pass the 900. But how do I get my foot in the door at an entry level?
1
I love this
1
I love this analogy. Then I talk to a Java expert (me as a DevOps) and he understands the JVM less than I do and doesn't know all the coding principles as I do (which I mostly automate either way).
So we compared our sample projects.
Mine was twice as fast and a lot cleaner, and his was, well, Expert Developer level. Hehe, Expert...
So yes, back to the topic.
Quite a few real DevOps with years of experience have mile-wide and mile-deep knowledge.
But it's not many of us, and we mostly work as consultants, not d2d opsdevs (coz most companies have opsdevs, not devopses).
1
I love this idea. Question: do you think it would be weird to gift a NAS diskless, without the hard drives? I am working on employee gifts (with \~$300 budget) but worry that it might come off weird if it's not the full, completed-bay NAS.
1
I love this one
1
I mean, sure, I can say I made a fact-based opinion about how certain tech will ruin my company, if you just ignore the one fact I presented about a company using that tech, and oh, that company is by far more successful than any company I've ever worked at, and oh, there are no other facts.
Why are you mad bro?
1
I mean... I've been running servers on straight Linux machines for a Long time. Dealing with updates, blue/green deployment, _host hardware upgrades_ et al gets pretty annoying.
I had to deal with this recently. Even though I thought I set up my systemd service for my website alright, it never came back up. So after an extended period of not touching it, I've gotta ssh in and figure out the incantations to start the different services in the right order. Except this time I lost my ssh keys when I installed a new distro after the last one crapped out. Shame on me, but...
Terraforming all that and deploying new services (matrix, plex, docker registry) with confidence is a huge boon.
1
I mentioned morning persons since it's a sign of someone who likes to have a very strict structure in their day, which relates to the "mathematical" mindset of someone made to be a developer...
But yes, I agree that stereotypes are always a bad guide...
I also agree that in the upcoming months I should discuss it more with my manager, like finding a team or project that I click with more...
Do you think I should also be open about being a night owl? Right now in my current team we have agreed to start our stand-up later in the day so I don't have to wake up very early, but I don't know if management knows about it LOL
1
I might get grilled for this, but I'd say give AWS Amplify a look. It simplifies standing up a serverless web app on AWS and generates a lot of plumbing code.
1
I misunderstood the /MIR flag in robocopy and accidentally deleted all the new files created in an afternoon for a state government's largest agency. And because of the backup window, they couldn't be recovered. Fun times.
Thankfully, no one was really mad and just wanted to make sure they weren't going crazy
1
I miss perl..
1
I notice that Gene Kim is an author on all three of the books most recommended or discussed in this thread: The Phoenix Project, The DevOps Handbook, and Accelerate.
1
I once flummoxed a fibre channel array (that was virtualizing storage for a couple of other arrays) by triggering a firmware bug when connecting a new cable to the switch. Unfortunately, connecting the cable and then configuring the port was the SOP at the time, and someone had left both ends in an 'up' state with conflicting settings. Normally you'd just have one of the ports error out, but on this particular day the ports had a little argument that ended in the whole SAN going "I am too confused to continue to work".
At that moment I felt a great disturbance in the IT department, as if millions of mailboxes, databases and applications suddenly cried out "wheres my fucking disk gone" and were suddenly silenced
1
I once updated the Pluggable Authentication Module (PAM) configuration for a new server I had just migrated to. While I did do a practice run in a development env first, and even validated the server was running as intended afterwards, at no point did I fully log out of the server and back in...
A few days later, I did the same update for prod, and sure enough, the next morning I went to diagnose some connection failures only to discover that I was not able to log in. Nor were any of my teammates. My update blocked access to anyone or anything, forever and ever. In fact, my change effectively told the server to simply fail on any login attempt.
Backups FTW!!!!
1
I picture we can blow up if we can keep our current trajectory (lol. isn't that the story of every start up). We seem to be pretty well positioned amongst other offerings that are over specialized. I don't picture us hiring unicorn status as fast as others but I think we'll get to that level in the next 5 years.
1
I practice "Infrastructure as AWS console" \^\^
1
I prefer Go for actual tooling that needs to be distributed, deployed and/or maintained long-term, and Python for local quick-and-dirty stuff (and sometimes things that run containerised). But in reality, there's way too much bash scripting.
There's also some TypeScript/JavaScript when my clients use Pulumi, and sadly, Groovy when Jenkins is involved.
Other than that, a bunch of DSLs like Terraform, a bunch of YAML-based ones like k8s manifests and Ansible, plus a bunch of mostly-YAML templating engines, the majority based on the Go templating engine (helm, gomplate, custom Go templating tools with the sprig library, ...), and stuff like kustomize.
1
I prefer Pulumi, actually; it makes sense to me to just write things in a language you're already familiar with, but I get why Terraform seems to have a larger market share.
1
I read both The Phoenix Project and The Unicorn Project. Do you think The Goal would provide extra insight? I keep thinking I need to get it, but it seems it would be more of the same, just in a different area (manufacturing vs. IT).
1
I read enough to go "oh okay, neat" and never finished it. Same with the "DevOps Handbook"
These books basically describe what is fucking obvious to anyone who understands DevOps.
So in that regard, they seem good for educating people who aren't already in the trenches.
1
I read it and it is a great story about how things can escalate if not managed correctly
1
I read it back in 2016, and it triggered a perspective shift in my (young) mind. Maybe the style and storyline were not so great, but the "big picture" view of an IT project, with the underlying process from specification to production and the parallel with manufacturing principles, really spoke to me at the time. I also read "The Goal", which I liked for the same reasons. As a consultant I use the manufacturing parallel a lot with my clients and start devops process implementation without mentioning any tools (docker, jenkins, sonarqube, ...), just drawing a release cycle articulated around a branching strategy.
1
I read it the week before starting my current job. Quite honestly the only things that stay with me from that book is the main character had a post it note on his old laptop that said “plug in before use.” I walk in first day expecting my negotiated MacBook and found and old dell laptop with the sticky note “plug into network before use”.
Also, I had never met anyone named Brent (that's the super engineer in the book, right?). When I started there were two Brents I had to work with.
1
I read it while traveling for Thanksgiving. It was terrible as a novel but better as a protracted business case.
The org I was working in was pioneering Continuous Delivery, so we were pretty much at the place where the fictitious business wanted to get to.
1
I read it, realized my organization could never change, then left.
1
I read it, twice. It's a good book. My first realization was that it was the reality-TV version of the life that all of us in IT lead. It resonates. The two things I found in it that really resonated as something I could and needed to adopt were
1. 4 Types of Work in IT
2. Goldratt's theory of constraints
Both of these were a pretty big awakening for me.
1
I read your name as "benisprobablyangry" and thought "benis… like penis? I still don't get it," and then I realized it's "Ben Is Probably Angry".
Yeah, I’m an idiot.
1
I really appreciate the points you bring up. It's definitely very doable from another language - my team is more Java oriented (hence Tomcat), so I'd imagine I could even just reuse or mimic a lot of the setup QA has for their UI testing. I'm glad you've brought this up, thank you.
1
I really dislike the operator model for Crossplane. It's also a big reason why CloudFormation is operationally dangerous at points. Constantly attempting to go from current to declared state can create operational headaches that Terraform's plan + apply flow avoids entirely (and bonus points for making your TF developer experience much better by using something like env0 or spacelift).
1
I really really appreciate the insight and time you're taking to answer my probably uninformed question. Thank you.
The goal here is to provide a container for someone that has grandma-level computer knowledge but wants to run the container and wants to pay for the resources needed. Thus, web clicking only. (no shade on grandmothers)
Others on this subreddit have suggested that one way to do this is by having a user on my account whose role is to take on the permission from their account to use resources on their account's behalf.
I imagine them getting a pdf walking them through the creation of an IAM account and users on the aws site. Then having them give that user information to my website through a web form. Then I run the container, which gives me a URL to the webserver generated by the container, and they follow the web link.
I am acutely aware that I am probably trying to go about doing things the wrong way, but this is loosely an outline of the only way I understand so far. How would you do it differently?
Incidentally, I'm not a scammer. I'm a bioinformatician trying to provide cutting edge bioinformatics software to flask-and-pipette experimental biologists who don't really use computers much at all.
1
I recently went through a similar change with my team. My initial approach was to absolutely minimize the amount of time spent on project mgmt type activities because I thought the team didn't want it getting in the way of them actually doing their jobs. We were basically just throwing a sentence about whatever it was in a Jira ticket and then assigning it, doing standups and throwing stuff in sprints every 2 weeks and assuming the work would get done.
The result was that we basically got out of it what we put into it - not a whole lot. There was still a lot of frustration from mgmt that stuff was taking way longer than it "should" and there was the feeling that individuals were underperforming because tickets weren't moving.
So we started upping the amount of structure that we wrap around the tickets, and upping the amount of time we allocate to planning. We're now around the 15-20% mark - that's an entire day per week (spread throughout the week) that we spend in planning. We take every task and then break down what the end result should look like and what steps we're going to take to get there. The result is often that we realize there's way more complexity and action that needs to be taken to accomplish something. Our estimates are getting way better and the smaller scope tickets move much more quickly. Overall the team and the mgmt is much happier and the quality of work has increased.
TL;DR: We use Jira. But the real lesson is to invest in your planning. Just throwing stuff into a tool and doing the bare minimum with it isn't likely to net you the results you're looking for.
1
I recommend InfluxDB for simplicity and TimescaleDB for versatility. Which one fits you best? I would move from InfluxDB to TimescaleDB if you need tricks that InfluxDB can't do.
1
I rolled out a firmware update to a whole hospital's worth of HP PCs. Tested 3 of the 4 models in the lab, and guess what happened to the one model that wasn't tested? Wouldn't boot. I needed to visit every one and pull the power to get them to run, then roll the firmware back. Luckily this was the first time we did change control on a major process, and when they wanted to come down on me my manager pointed out that no one objected to the plan, so they couldn't get mad at me.
1
I second Crossplane.
1
I second that.
1
I second this - Read up on the different colour switches though, or it may end up driving you mad if you both work from home!
1
I second what gregnorz said, and just wanted to add my own suggestion for "foot in door". (I'm a Sr. SRE at a cloud provider and involved in hiring, fwiw).
Entry level DevOps / SRE roles are pretty rare as it is... by definition, the job requires a pretty solid grasp on both development and operations just to start with, so you're already kind of learning to swim by jumping into the deep end, as it were.
If you want a foot in the door however, I highly suggest the "show, don't tell" approach, no matter where you apply. 90% of my interview time is spent trying to figure out...
1) does this person have the technical skills to do the job?
and
2) can they fit in with the team?
I can't really help with #2, but for #1, pretty much every technical question I ask is geared towards figuring out the candidate's actual skill level vs what they're telling me.
Fastest way to skip the quizzy whiteboarding questions is to point me at your github account and show me what you've done at your current job (if you can) or at home so I can see your work for myself.
Stuff I look for:
* Code style: Neatly written with comments, or spaghetti code with lots of cruft?
* Unit tests: Are they used? Are they effective? What's the coverage like?
* Common systems: What sort of stuff do you have practical experience with? Docker, Jenkins, APIs, home automation, automating the boring stuff, etc.
* Workflows: CI/CD stuff. When you write and push code, do you know how to automate the build and release process? Is the config managed in code?
If you don't have anything, but want a good example project... set up a basic web service / API. Get it to the point where you can just create a pull request, your code will be automatically tested, then when it's merged, it auto-deploys your update and validates the deployment. When the interviewer asks what sort of stuff you've done... point them at your github repo and walk them through the process.
If a candidate can do that, we can be fairly sure we can teach the rest of the job without any concern.
1
I see 363 in shipped. Might just be the mobile UI since I’m on my phone.
Yeah, from my vague understanding Terraform uses the SDK where there are gaps (so, like you'd end up doing with custom resources, it's just already there). In theory the CloudFormation registry should allow the community to plug some gaps, but I don't think it's really got the adoption yet.
1
I see a bestemmia, I upvote
1
I see, I've been using their all-in-one solution as a drop-in replacement for Prometheus, but you can break down the system into sub-components and keep what you need. Thanks!
1
I started my job without a lick of production knowledge in Terraform and GCP, took me like 8 months to play catch up but I did it. What I've learned from my 10 years of being in IT is that, if there are examples out there that you can peep or someone who can help do a knowledge transfer with, it's only a matter of time before things start to click.
But examples are your best friend, the nice thing about walking into a company that already has stuff in place is that you often don't need to do everything from scratch and can take a bit from here and there to get what you need done.
1
I started typing in:
rd /s/q \
and then realized this was a mistake. Instead of hitting backspace, my fingers decided to hit enter, as these keys are somewhat close together. Since I was sitting at C:\, this was the Windows equivalent of `rm -rf /`. I realized my mistake in only a few seconds, but by then the damage had been done. Thankfully, I was able to restore from Veeam Backup within a few minutes and no data was lost. Since then, I am a lot more careful with this command!
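A cheap guard against this class of accident is a wrapper that refuses to recurse over the filesystem root or other critical paths. A minimal sketch (the function name and the protected-path list here are made up):

```shell
# Hypothetical guard: refuse recursive deletes of root or critical paths.
safe_rm() {
  target=$(realpath -m -- "$1") || return 1
  case "$target" in
    /|/bin|/boot|/etc|/home|/usr|/var)
      echo "safe_rm: refusing to delete $target" >&2
      return 1
      ;;
  esac
  rm -rf -- "$target"
}
```

Aliasing the dangerous command away entirely, or at least forcing an interactive prompt, works just as well.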
1
I studied telematics way back, and especially with wireless there are only so many places where you can put antennas; every so often the government will auction those off. More or less the same is true for the infrastructure: it's not just the permits, it's also how much space is allocated for that sort of stuff. Much of that is controlled by local government and largely already allocated to existing companies (most of whom are the biggest today because they bought the most space way back).
1
I suggest you take the CKA and/or CKAD certification. Add some Go programming to your mix if you really want to get far with Kubernetes.
1
I think 4 days on AWS is too much.
If you have limited time do the basics (Linux, CI/CD, Containers, automation), specific cloud environments are like programming languages, if you know one you can get started with another one rather quickly, from my experience it is not a deal breaker for employment.
1
I think a monorepo structure works best with CUE, since you can easily share config across multiple services/tools. I put the 'definition' CUE files alongside all the other files with no special treatment. 'Config' CUE files, that is, CUE files that I expect to produce something like YAML or JSON when they're compiled, I usually put in a dedicated subfolder within a service/tool. If you use Protocol Buffers you can think of them like the .proto files.
[Dagger.io](https://Dagger.io) uses CUE as its definitions language and uses the 'lattice' language design to great effect. You can contrast 'lattice' with 'inheritance', although language nuts might kill me for saying that. You use [Dagger.io](https://Dagger.io) in the same way I defined the 'profiles' earlier. I made my system before [Dagger.io](https://Dagger.io) was formally launched, so you could probably replace my way of doing things with Dagger.io .
For where to draw the lines you can think of it like, "any situation where I use YAML or JSON files I can use CUE". Incremental adoption works well with CUE. You can replace a few YAML service definitions with CUE-based ones first and build up a repository of definitions over time. So it's best to pick your existing config locations (Kubernetes, app config etc), run `cue import` to import the file/folder to CUE, delete the original, and make a compilation step to produce that file in CI/CD.
Since it's early days there isn't a whole lot online. I mostly used YouTube and tutorials online to figure out what to do. The official docs are good for learning the language, but they can be quite abstract. The [Slack community](https://join.slack.com/t/cuelang/shared_invite/enQtNzQwODc3NzYzNTA0LTAxNWQwZGU2YWFiOWFiOWQ4MjVjNGQ2ZTNlMmIxODc4MDVjMDg5YmIyOTMyMjQ2MTkzMTU5ZjA1OGE0OGE1NmE) is very helpful and [Github Discussions](https://github.com/cue-lang/cue/discussions) are active.
[Hands-on Introduction to CUE](https://www.youtube.com/watch?v=fR_yApIf6jU)
[Using CUE with Github Actions](https://www.youtube.com/watch?v=Ey3ca0K2h2U&ab_channel=cuelang)
[CUEtorials](https://cuetorials.com/)
1
I think coding is not the skill of the future (that's what you'll be doing if/when you master this role), but it's absolutely a screwdriver in the tool belt, and that's the least useful it will be. No one can really get a lot done without a screwdriver.
Easy skill to overlook as we make our jobs more abstracted and less code driven? Sure. Will it ever go away? No.
I started as a coder and this is where I come from on it. It's not the future of your job, but there's no reason not to know it.
1
I think Datadog may work
1
I think DevOps is like security, it grew into a specialty out of ops. I think the specialized skill set and view of the problem will still exist and is still needed. They are just moving around asses and chairs to make it look like progress
1
I think I understand you better now. Deleted my previous message.
Because I follow many areas in IT, I do have a strategy I follow. I am just not sure how I can explain it without confusing you or anyone who is reading it :D
Thank you for this question. It will help me review my process and refine it along with documenting it for sharing here (If I am able to complete it on time ;-) )
1
I think I've gone through this with every new job. Six months in though, you're the go to guy.
1
I think if you started with Infra ad Code, or learned about Gitops, that would help you a lot. Do things per environment using pull requests. When new code is merged into the develop branch, run some github actions to deploy something to a dev environment.
This one concept can help you with the foundation to build a lot of your other tooling knowledge on top of.
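The merge-to-branch-triggers-a-deploy idea boils down to a branch-to-environment mapping. A tiny shell sketch of what such a CI step might do (the branch names and environment names here are just hypothetical examples, not any particular CI system's convention):

```shell
# Hypothetical mapping from git branch to deployment environment,
# as a CI pipeline step might compute before deploying.
branch_to_env() {
  case "$1" in
    main)      echo production ;;
    develop)   echo dev ;;
    release/*) echo staging ;;
    *)         echo preview ;;
  esac
}

branch_to_env develop   # prints "dev"
```

In a real pipeline this value would then select the Terraform workspace, kubeconfig context, or similar, via a pull-request-gated workflow.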
1
I think InfluxDB would fit the bill here.
1
I think it depends on what you want out of said role. If you are looking to get PAID then that could be a good way to go! If you are looking to really get with a group of people passionate about building and maintaining well run, well designed, well managed technology; then outsourced is probably not the right place to be.
1
I think it just means storing your infra declaratively so that you can deploy easily
1
I think it’s tricky in this case.
Hard to say for sure tbh, without knowing exactly what was done.
If it was say a change to egress policy on what routes are announced at their edge routers - then 100% you could do them in segments and validate they are doing what they should.
But there are other types of network changes that need to propagate everywhere at once. The internal network state needs to be consistent or you can get routing loops and traffic blackholed.
Overall it’s hard to say
1
I think it's more likely you will get a DevOps role with the bootcamp training due to your past 6 years of experience in IT. Bootcamps can be really great, especially if it's a well-known one or has really good credibility.
At the end of the day though, as long as you can prove that you can do the job and know what you are talking about, that's all that really matters. Oftentimes degrees are needed just to get past recruiters/HR, whereas the actual department that hires you probably couldn't care less if you have a degree, as long as you know what you are talking about.
1
I think most of the Web3 stuff is bullshit lol. Especially given what's going on in the crypto market. But from an infrastructure perspective it's probably very similar to Web2.0 with a focus on the usual "scalability" "resiliency" etc.
1
I think my point was missed. They are very different approaches, but they solve the same issue; in that sense the comparison takes place. Boats are different from planes, but we can compare their travel time experience.
1
I think one thing which gets lost in the overall monitoring and observability conversation is that M&O does not replace the need to have depth when it comes to technological expertise. Just because you hear hoof beats and immediately think horses (assuming you are not in a place where Zebras are more predominant than horses) does not mean that a Zebra is going to pop up, and no amount of M&O will help creatively pull the FULL picture of the Zebra together. Moreover, in a real DevOps paradigm, when a Zebra is discovered, that should really be triggering conversations to help understand that Zebra; fixes or changes should be put in place to remove the Zebra from the landscape and get back to a field full of horses (again, assuming that you are not in a place where Zebras are more predominant than horses).
When it comes to monitoring and observability, I normally live on the line of "if you think there should be a test for a condition, then that test should just be a monitor" (this is in the functional space, not the unit or integration space). With that thought process, you are constantly homing in on the expected state of your system and rooting out all of the unexpected Zebras, while ending up with a functional, real-time suite of monitors which always describes how you expect your system to act, plus negative tests that tell you when you are outside the bounds of normal.
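As a toy illustration of "a test should just be a monitor": the same assertion a functional test would make can run on a schedule as a check. This sketch is hypothetical (the check, the function name, and the default threshold are invented for illustration):

```shell
# Hypothetical check: the same assertion serves as a test in CI
# and as a scheduled monitor in production.
check_disk_usage() {
  threshold=${1:-90}   # max allowed usage of / in percent
  usage=$(df -P / | awk 'NR==2 { gsub("%", "", $5); print $5 }')
  [ "$usage" -le "$threshold" ]
}
```

Run it from a test suite to assert the condition once, or from cron / an alerting agent to assert it continuously; the predicate itself never changes.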
1
I think somebody might need to revoke my title
1
I think that Ansible is a good start even if you will be moving to Docker images afterwards. Containers don't happen overnight and bring extra maintenance and troubleshooting with them. You need to get your head in the right space first: think of servers as cattle, not pets.
Install a fresh box and configure it completely using Ansible without logging in yourself. Having the box destroyed should not stress you out; just set up a new one, run Ansible on it, restore data, and be done with it.
You might as well Ansiblize existing boxes while you're at it and maintain all the servers from playbooks.
When that works, set up a container host/cluster using the same principle. Once that works well, get going with building container images. Stop doing things manually (within boundaries ofc) and automate it all.
1
I think that our fairly small tech org (30-some devs + 3 DevOps) has actually been fairly good at utilizing the principles in the book. I have had some leeway in that as a DevOps engineer. It helps that we had an IT Director and CTO who saw the advantage of moving into the type of processes that make DevOps possible.
Over the past year, we moved from doing deployments once every two weeks to doing deployments weekly. It’s a slow progression but we’re getting there.
1
I think the biggest challenge is finding someone who is passionate about building sustainable flow vs someone who is hyper interested in a specific tool. Now don't get me wrong, tools are cool and people who build tools for consumption are probably cooler than the tools themselves. But, as someone who is producing products for market consumption, I think the tools are meaningless if you are not able to conceptualize what is important, such as flow, agility, building for the future, cleaning up after yourself, and not being afraid to show your skills off and go the extra mile in quality (while not getting taken advantage of by a business).
1
I think the true answer is "it depends". I studied informatics for 8 years for 2 degrees, and I find that I get a ton of use out of all the boring fundamentals I got from that, and it made me good at what I do.
Today though I am probably at mid career, but potentially peaking in terms of capability. So in my case I wouldn't bother with something that would be boring and not increase my knowledge or capability enough vs time invested I would just ditch the book and look for something else. I have a bit over a decade of experience in my field and I feel like the value of book smarts has decreased given my experience. So spending time reading such wordy bible-type books is a waste. I would opt to go for more condensed and immediately relevant research instead. If I were much earlier in my career I would consider going through something like that, but only if I had close to 0 experience or knowledge in that field/topic.
1
I think there is an issue with the statement about the number of issues. The way they use the [roadmap](https://github.com/aws-cloudformation/cloudformation-coverage-roadmap/projects/1#card-84110708) feature seems to leave issues “open” even once they have moved to “Shipped” which accounts for over half the count.
As well as Cfn lint it might be worth taking a look at cfn_nag, Cloudformation guard, hooks, taskcat etc to see if they help with any of the pain points.
1
I think you should get him a goat
1
I think your cousin might be lying to you.
1
I thought so too, but I saw the paychecks and the position listing. They've been in the job for 7 months now with no issues.
The course doesn't make you great at everything in the job, but it says it'll get you the general knowledge of the standards so you look like a decent hire. They don't even charge the majority of their price until after you've been hired for more than 1.5 months.
It's pretty intensive from their scheduling standpoint, was just curious if anyone on here has gone that route with any success.
1
I took 1 CS class in college, so those courses are not a requirement at all. Maybe automate some of the stuff you read about in your coursework or that you encounter in your internship? Like implement some networking protocol that's interesting to you. That way you can reinforce what you're studying and also learn a new programming language? Just start small with the CS stuff.. you're not going to be writing and designing multi-tiered, infinitely scalable stuff at the start. Just try to write some code, get something working, and either improve it or add features to it as you go.
1
I took over a GCP project from a defunct team. Discovered multiple GKE clusters, with double digit node counts, had been running (idle) for nearly a year.
1
I tried reading it but honestly found it wanting. It's a poorly written fictional take on organizational change. The book is full of straw men and forced dialogue, and I personally found it pretty amateurish. It read a little like the text of a 90's after-school special, where the intent is well-meaning but the situations are so badly drawn it's hard to suspend your disbelief.
I found Accelerate much easier to read because it was fact based and didn't follow an imaginary narrative
1
I understand the sentiment, but that's not a good interpretation. Both solve the same problem.
The comparison is of how best to arrive at a given solution, or of the inner workings of each. We can compare travelling by boat, plane, car, and train without saying it's not a fair comparison, because the nature of the comparison is transportation, not the category of one thing.
1
I updated the question with my requirements.
We are deploying Webapps, so no need for mobile runners.
1
I upvoted it, not sure what the problems are for other people.
What I can tell you is that for me (2-40 million row tables) QuestDB is fast, but to be fast it needs to be correctly indexed.
Another annoyance is that it implements a PostgreSQL-like interface, but not all PostgreSQL commands or queries will work.
What you could do to decide which solution is best for you is implement a small test and run each type of database in a Docker environment, and see which one satisfies you more.
1
I use a mix of GitHub actions and Jenkins.
1
I use a strict calendar that sets aside a certain amount of time per day for learning.
That’s it.
1
I use bitbucket at my current gig and I was surprised by how much I took gitlab and github for granted.
Things feel opaque and a bit disjointed constantly. I wouldn't bother switching to save a buck.
1
I used “Terraform Up and Running” from O’Reilly books when starting Terraform. It walks through a lot of the good practices.
1
I used ADOS for two years at my last company and GitLab for the last two years at my current company. ADOS is way better IMO.
1
I used it and it was much easier to configure than GitLab CI which I use now. Especially for simple pipelines it’s totally fine.
1
I used to use Python a lot. Now I use Go for everything + bash.
1
I want that! It's not my birthday but.
1
I want to understand why there are 4 downvotes on this comment; is there any major issue with QuestDB?
1
I was a full stack dev for many years then went into DevOps for maybe the last four. It would now be very difficult for me to get a software dev job again.
I did a lot of C# and MS SQL on Windows servers as a dev. Those skills are becoming less and less relevant on the job market.
I don’t know React, Scala, or any of the newer fad languages from the past four years.
There’s always new languages, frameworks and patterns coming out in software development. If you’re out of that discipline for a few years, you fall behind.
I can lead a DevOps team, but I can’t lead a software engineering team anymore. I lack the expertise and experience in the languages to help the rest of a dev team. It doesn’t make sense to most companies to hire a lead engineer who is going to operate as a junior or mid-level engineer for months as they learn the quirks and unique best practices of a particular language. I’ll get beat out by someone with more relevant experience.
1
I was applying for 2 months. I applied for a different, much more programming-oriented position. I was asked about my experience and to grade my knowledge of C++ from 1 to 10. I chose 3. Some chitchat about my experience later (I told the engineer interviewing me that lately I'd been using bash, which was true; I have a lot of home automation projects glued together with bash, and it was also the only "programming" interface I had when hacking into industrial production machines if they were running Linux), we said goodbyes with a promise to set up another meeting for a practical exam. I wasn't very optimistic then. I knew I was no serious programmer, though I am pretty good at connecting things together, be it software or hardware. I just wanted a better job, and with a complete lack of opportunities that aren't bordering on slave labor in my home area, I was applying to whatever was remote or hybrid.
Not even an hour later another guy from this company called me and asked for an immediate interview. We talked about how Linux works, and I was asked to write simple stuff in bash (which I barely did in 30 minutes; I just wasn't fluent, though right now I would do it in 3 minutes tops). I wasn't very optimistic after that interview either, but a few days later I got a call from him: "You want this job? Here is what we offer...".
That's it I guess. My tips? I guess job search is just a numbers game.
If I was fired from this job today I would write a bot that applies to every remote job in the country. (Actually I think there is some public project that does this for linkedin: https://github.com/nicolomantini/LinkedIn-Easy-Apply-Bot )
Probably You could get away with applying to every job on the continent with some friends and social engineering.
1
I was at a startup very briefly where they straight up admitted to wanting to copy Google's hiring process and only hired grads from a bootcamp that focussed on LeetCode, which is, in part, where I got the idea for this question from.
1
I was happy canceling our Bamboo renewal.
1
I was in your position. Then I noticed that for cookie cutter stuff I'm just typing the same shit every time. Install httpd. Do iptables. Create mysql users.
So I started writing bash scripts with cookie cutter commands. But sometimes I wanted to customize a bit of config differently. So bash vars became cumbersome.
That pushed me into Ansible and templating confs.
By doing manual commands also means you already know to build containers. Docker you'll pick up quick.
For k8s, I started that as well, on-prem, on VM's and some phased out baremetals.
So you have good prereqs but need to push yourself to be more lazy and start automating.
You'll also make fewer mistakes than copy/pasting commands - like that `rm -rf` from history you accidentally run on the wrong mount...
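The jump from cookie-cutter bash vars to templated configs can be shown in miniature with plain shell, before reaching for Ansible/Jinja. A sketch (the template placeholders, file names, and function name are all made up):

```shell
# Hypothetical templating step: substitute placeholders into a config template.
cat > vhost.tmpl <<'EOF'
ServerName {{SITE}}
Listen {{PORT}}
EOF

render_vhost() {
  # $1 = site name, $2 = port
  sed -e "s/{{SITE}}/$1/" -e "s/{{PORT}}/$2/" vhost.tmpl
}

render_vhost example.com 8080 > vhost.conf
```

Once the substitutions multiply and need loops or conditionals, this is exactly the point where a real templating engine like Jinja2 (via Ansible's `template` module) pays off.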
1
I was in your shoes 4 years ago. Slowly I started taking notes on the procedures that I was taught; whenever I built something infra-wise I would create a doc for myself on what I did and why. I did it mostly because that's what I thought was sensible, and in part because if the manager left there would be no one to remind us how to do something. Now I'm the manager, and whenever someone asks something I just send them the article. All new recruits now have access to the knowledge base, and even when I don't remember, I have the doc there. It is really paying off now: the company is scaling and a new recruit can be effective in a couple of weeks instead of the months it used to take.
1
I was just curious: does Apple silicon have anything similar to the hardware-accelerated virtualization solutions for Intel and AMD? And does this project take advantage of that?
1
I was part of the implementation of SOX controls at my workplace. It was a bit of a headache, to say the least.
The upshot of it is that our source code check-in AD groups don't contain any overlap with AD groups allowed to promote code to production.
It's hard to be much more specific in a post (for me at least)
1
I was pushing an upgrade to software that configured flash storage on our database servers. We push in stages of ever increasing host counts and during the first few stages things were going fine. Some unrelated event I cannot recall caused us to decide to pause the roll out for a few days. When I resumed the roll out all of the sudden the database processes were being restarted on the hosts being upgraded and since we were in the later stages of the upgrade process, this was happening on thousands of database hosts. This caused a very large messaging service to go offline for a very large number of users.
During my pause of the roll out, someone else pushed a change to configuration management that caused the mounts to be yanked out from under the database process whenever these software packages were updated. This change was not there during the initial stages of the roll out so we didn't see any problems on the hosts being upgraded initially.
So, it wasn't my change exactly that caused the outage, but restarting the roll out of the upgrade (including force-reinstallation of the software packages even if they are already upgraded) instead of pausing would have prevented the outage.
1
I was talking about the tf state which can contain plaintext secrets. That's why it's recommended to use a remote state backend.
For secrets in general something like sops can definitely help if you need to store secrets on disk. Config management tools such as Ansible also allow you to store secrets in AES encrypted vaults so that you can safely commit them to git.
Just make sure to not commit plaintext secrets accidentally and then remove them with another commit or revert. That's not gonna work as the damage is already done. You should directly edit the git history to undo it and then force push or you'll be forced to use bfg to properly clean up the repo.
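That "a revert won't help" point is easy to demonstrate in a throwaway repo: after committing a secret and then deleting it in a later commit, the value is still reachable in history (the key value here is obviously fake):

```shell
# Demonstration: deleting a committed secret does not remove it from history.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email demo@example.com
git config user.name demo

echo "API_KEY=hunter2" > app.env
git add app.env && git commit -qm "add config"

git rm -q app.env && git commit -qm "remove secret"

# The secret still appears in the diff history of both commits:
git log -p --all | grep -c hunter2
```

Only rewriting history (git filter-repo, BFG) followed by a force push actually removes it, and even then any already-cloned copy still has it, so the secret should be rotated regardless.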
1
I was very close to asking why you would dig out that well-travelled roadmap again, but on second thought: Thank you :)
To the original poster it might not be as obvious: We have our own tool churn going on in devops :)
And ideally, you wouldn't be building 500 individual solutions to 200 problems, but build a bigger ecosystem that allows you to run your software efficiently(ie. reasonable effort to keep everything up to date and react to bigger configuration changes).. So in a way, operations is becoming a lot like software development and I think it's a good thing.
1
I wasn't in IT support. I was in a very mixed role but mostly focused on DBA/Linux with a decent amount of programming (more so scripting). I should point out that my role is a bit unique where I'm senior within the company but junior in terms of devops experience. I've mostly focused on containerisation and automation tooling (Ansible mostly) as of so far. That is however with years of experience in Linux, coding, etc which is extremely important.
1
I went to a college that focused heavily in theory and maths. The sort of elite place inspired by the French model. During the first years of my engineering degree I was learning stuff like complex analysis, information theory and Fourier analysis.
Being a maths nerd is good for showing off and being pedantic, with a few exceptions (like if you are an academic or in very specific positions).
IMO any DevOps can greatly benefit from previous developer experience. You get to understand why tooling exists and why systems are architected in a certain way.
Keep in mind that, at the end of the day, we are all software engineers: we are all designing, creating and managing software.
Finally, you're just starting out. It's gonna be 10+ years until you think you know what you're doing. Have fun.
1
I wonder if people think it was written in some language from _magic fairy dust language country_ or if YAML is actually a thing that can be interpreted/compiled.
It should be pretty obvious since, at least, cfengine or puppet that the underlying implementation is mostly just one of the usual languages…
1
I work for one of the major Canadian telcos. Yesterday morning I pushed a config change to our BGP routers.
1
I've worked on Kubernetes on a daily basis for well over 5 years. I may be getting old, and I've had my share of brain-abusing substances, but this shit is not easy and doesn't get easier with the time that passes. I'll take Heroku and the likes whenever I can, and will always recommend them to anybody who needs a solution. I earn well for my knowledge, but this ain't easy and ain't friendly one bit, sorry folks.
1
I worked as a consultant for a very large company for very large clients after I was outsourced from an in-house team. Pay was good. Always treated like a number though. And, the work was always, always chaos and constant stress. Definitely prefer to be on an in-house team.
1
I worked at a bank that, 15 years ago, worked on ancient tech and as far as I'm aware still does. They're happily in business and making plenty of money for plenty of people.
Because for a bank, being good at banking is a lot more important than scalable tech.
Unless your product is your tech, it's highly unlikely to be your reason for failing. I'm not saying it doesn't happen but making it your primary focus is a mistake and likely not going to be what keeps you running.
1
I worked at a startup that did this, and it bothered me a bit. None of the staff really understood why they were doing it.
But it finally hit me. In a true startup mentality, you may not have time to find and evaluate the best solution for your environment. You just need some solution that will “work”. So along those lines, if it works for FAANG, it must be good.
The tech stack isn’t important. Getting the features out, is.
Another point, sometimes it’s easier to hire ex-FAANG engineers when you have the same platform.
1
I worked for a telco company in 2009 and their billing system had one web component developed in PHP. I uploaded the wrong csv type into it and the result was that about 30k customers in France got their payment type changed from bank account to cheque (I still don’t know what this means).
1
I worry that using two different testing frameworks will result in unbalanced coverage: consider using a BDD testing framework that can be applied to both Linux and Windows hosts, maybe [serverspec](https://serverspec.org/) would have some of the pieces you need?
1
I would certainly do that if my goal was to just build the app through the GUI. My goal is to build the app using my own knowledge and experience in terraform, github, AWS serverless services, HTML/CSS/JS (which is a work in progress), and Python.
1
I would check with them on the saas only. I too need to have the same functionality to replace vault and we just had a full demo from them and to be honest I liked akeyless better for the dynamic secrets and rotation than what we had going on in vault.
1
I would for a more computer literate end user, but this user doesn’t know what containers are and doesn’t really care.
I’m basically trying to create a gui that will pull my container and launch it on their behalf, pending their oauth login or transferred permissions (as suggested in the other reply).
1
I would likely make heavy use of GitLab then, with the runners and the auto-deployment feature it has that I've never tried before. I will also brush up on the term "GitOps".
1
I would not recommend BitBucket Pipelines if you are moving away from GitLab CI. GitHub Actions, probably but definitely not BBP. lol not in its current form.
1
I would politely disagree. If you have used both Terraform and Pulumi, as I have, you would know the difference "instantly".
If you want to evaluate Pulumi and compare it with Terraform, you have to understand CDK for TF, without it, you are technically comparing a class definition with a variable:value mapping :-)
1
I would really question the security and practices of any company that would allow that.
1
I would say become very strong with the technologies you use at your job. Then start looking at alternate tools that serve the same function. Identify the hottest ones and work on those. After that then just work on what you find to be the most interesting.
1
I would say go deeper into the Azure stack (Azure DevOps, Azure cloud).
Start by learning Azure DevOps, then how to deploy cloud infrastructure in the pipeline, ideally using Terraform, then how to configure the VMs to a certain state using Ansible.
Then lastly how to push the app to that infrastructure.
Also start with simple infrastructure, aka VMs and web app services.
Then go deeper and start adding NATs, VNets and private subnets.
You might want to learn a little bash just to pick up a few automation tips; or, if you'll be using Windows VMs, you can learn PowerShell or Python.
1
I would suggest you start with Google Cloud, using Qwiklabs. They give one free month, so you can use multiple accounts for free one-month access. They have good tutorials and hands-on labs that will help you. Once you are done with it, take an Azure free trial and start practicing what you did on GCP. You should find most of it familiar.
1
I wouldn't call that coding, that's configuration. These tools used to be called "configuration management" if that's any tell.
1
I wouldn't install Tor Browser either
Use a VPN when doing that stuff if you have to -- even going to YT
1
I wouldn't recommend InfluxDB for more than a million unique time series, since it requires **huge** amounts of memory for high-cardinality data (e.g. a high number of unique time series). See, for example, [this benchmark](https://valyala.medium.com/insert-benchmarks-with-inch-influxdb-vs-victoriametrics-e31a41ae2893).
1
I wrote a puppet provider for this a decade ago, but it was too slow for large quantity record usage.
1
I wrote the question while I got the idea so I hadn’t done my due diligence.
But yes pretty much what I need is to write a web app interface for managing docker containers.
I will look into wrappers for the docker api that seems like a good start or most likely read the docs first.
1
I'd add Eclipse Che to the list :)
1
I'd agree with it being a bad gift. A keyboard is very personal, and due to the complexity of the purchase it does not make a good gift.
1
I'd be networking internally. At my company I found a kubernetes slack channel where people from other orgs are willing to help. Not sure I could have done it without standing on the shoulders of others. Also the team that manages kubernetes is very keen to have happy customers so they will take the time to help build configs etc.
If you have no internal resources to network with then may stack have mercy on your soul.
1
I'd just make a webapp, you can always wrap it up in electron later :D
1
I'd look into Cloudfront. We are doing something similar for hosting our app in multiple countries. CF will handle SSL termination for multiple hostnames and domain names (make sure you configure your SSL cert correctly with the appropriate SAN's).
1
I'd recommend it in combination with the DevOps handbook. I've been in a company that really did a DevOps transition and the results were stunning.
Lots of comments are saying it's contrived and they're absolutely correct, but it's a very simple way to get into a lot of complicated concepts. The Handbook has the practical steps, and The Phoenix Project brings it to life (even if it's obviously fictional with odd characters) and makes the points. Don't criticise it for not being a phenomenal novel; see it as another way to hear the same messages.
1
I'll say, yes and no...
We got that directive a couple of years ago, and our entire dev environment was created with cloud formation and auto scaling groups, so we scheduled stack deletion or ASG modification to scale things down.
However it was impossible for us to satisfy the "customer" that was the dev shop, because the next day things would spin back up on the wrong branch or on a branch that didn't exist anymore.
So we stopped doing it entirely. 😔
Sometimes the cloud platform doing the right thing just gets in the way.
1
I'll start by saying it is a hard job being a one man band dealing with it all, and that there is no shame in doing what you know how to do and do that well.
In the past I would have told you do the greenfield stuff, like take openshift or EKS, lift and shift your monolith in containers, etc...
Now I would say take incremental approach and get value early.
Fundamentally it will be less stressful than a big-bang change or trying to learn it all at once.
Now I think you need to take one problem/layer at a time and deal with it.
If you don't have one yet, make sure to start by hosting a git repo somewhere (e.g. Gitea) and a tool like Jenkins, so you have a place to trigger your automation.
Since you have looked at Ansible and you plan to do away with VMs at some point, I won't cover automating the building of VMs (you probably have some sort of golden image already).
Instead make a playbook about core OS stuff like configuring SSH, users, sudoers, installing the tools you need, and generally doing the OS tuning you need.
If the basic tuning differs between servers, that's OK: you can have different roles.
The good news is that there are playbooks available (I would look for Linux hardening ones) that you can draw inspiration from.
Then once you've run those in test VMs and you know they work, run them in dev, then in prod.
It is very scary, but a necessary first step.
Make sure to do all your fleet, so it is consistent.
Once you have consistent ssh, start with automating backups.
I'd prioritise data backups; make sure they are uploaded to a cloud (maybe encrypted).
I am talking database dumps, user-generated files, etc.
Things you cannot recreate with the help of a few devs.
Once you have that done you can look at backing up apps configuration.
That's config files, cron jobs, custom files, etc.
Bonus points if they end up in a repo so you can track changes the devs made on the server.
Then you can tackle installation of middleware, like PHP version, and making more specialised roles for your different kind of servers.
You can also start to make separate VMs for your backend, frontend, DB, etc.
It means more VMs to manage, but by now you should have automated enough to be able to cope.
Once you can get a VM from zero to ready for an app, and can back up all the configuration for a particular service or set of services (and restore it), then you are ready to automate the app delivery.
You can start to push config changes from the repo (going from the repo being a backup to being the source of truth).
I guess there are ways to do that that are best practice and others that aren't, but don't feel like it has to be perfect to be useful.
The process of going from nothing to some automation will help the devs make the code better, but that's not instant. Making stuff stateless is hard.
Eg. figuring out how to split your monolith and get the devs to follow your lead on some fundamental changes of process will not happen overnight and splitting that change into many small changes is more palatable.
The whole DevOps mindset is a cultural shift and the hardest bit is to change mentality not tech.
So you might need some support to get there so do give us an update here and there when you get stuck.
TL;DR: you can go for all new tech and best practices in one go, or, better in my experience, make iterative changes where you get a bit of value rapidly with little disruption to the workflow.
1
I'll tell Terraform to tell AWS to set up the alerts. Ha!
1
I'm a *huge* fan of separating infra from our code. They change at different times, and there's no *real* reason to store them together.
Infra has its own pipeline and software has its own pipeline. They never trigger each other and don't interact in any way except for our actual environments.
I've done this with terragrunt/terraform pre-tf-cloud days and we do it with Pulumi and ArgoCD now.
1
I'm a fan of [question driven development](https://nickjanetakis.com/blog/learning-a-new-web-framework-with-question-driven-development) for learning almost anything that can be researched.
Basically you learn things as you go and focus on fast feedback loops between wanting to do something, looking it up and then doing it.
Personally I find it very unnatural to try and read 900 pages up front. I retain things best when I'm actively doing something, trying to solve my own personal problems and generally tinkering with things.
Keeping these "things" isolated to small tiny tasks is why I like question driven development. It's never "how do I learn Kubernetes" or "learning Python".
It's always something bite sized like if you were trying to learn more about Kubernetes you might find https://kubernetes.io/docs/concepts/overview/components/.
The first sentence is "When you deploy Kubernetes, you get a cluster.". If you don't know what a cluster is then your journey begins with "what is a Kubernetes cluster?" and you go from there.
1
I'm a fan of GitHub Actions or Concourse CI
I've also used a LOT of different CI solutions and in my professional experience overwhelmingly the biggest thing that drove complex requirements for a CI system was unnecessarily complex underlying build or deploy architecture.
When I have opportunities to get close to the code base of the projects being built, or can define the contract between the repos and the systems, it's much easier to make scripts or services which just need a simple runner to invoke them than to try and rely on complex orchestration features of a CI platform.
Example would be rather than trying to coordinate the change of multiple systems in a parallel manner using CI runners, write a script that offloads that to e.g. a k8s cluster which is actually designed for that use case. Your CI system can simply block the job and poll if you want to wait on status completion, or your k8s tasks could include one that monitors its own jobs and reports back when it's done. The CI job could even just return a k8s invocation/job ID or similar, allowing the CI job to quickly exit, and then an operator could query back the status of the job ID as needed (if you wanted to minimize CI runner cost).
1
I'm a lead software engineer that does "devops".
To answer your question, it really depends on the individual company's devops culture. For us, we transitioned from a "devops" role to a "platform engineer" role. This is a software engineering role with a focus on infrastructure, tooling, in-house platform services and dev guidance for using all of the above. I write code *all* *day*, C#/F#/Go/Typescript/Bash/Terraform primarily.
If your scope of work just includes writing shell/glue scripts and Terraform, you will learn some valuable things, but your programming skills will start to slip, especially more up to date knowledge of new libraries/frameworks.
Jobs exist for both kinds of roles.
1
I'm a strong advocate for documenting, so my frustration is getting other people on board. Statements like "the code is the documentation" make me roll my eyes.
Documentation **should** exist at each level to the degree that makes sense for the project.
I'd rather find a stale doc than no doc.
1
I'm at a shop that uses bitbucket because we use all the other atlassian products. 10/10 do not recommend.
GitLab or GitHub or azure DevOps. All 3 are great
1
I'm curious what people here think about in terms of Pulumi adding more testability and static checking by being written in an actual programming language.
I'm curious what footguns people have had with Pulumi. Lots of people are just saying that without any specifics -- and I can definitely imagine a company just having policies to write Pulumi in a more sane language than C# for this sort of thing (Python or Typescript, probably) with proper code linting tools to make sure you're not following anti patterns.
As someone who works at a company that uses neither but rather something custom in combo with Cloudformation, all I want is for the company to have written "templates" using an actual programming language with proper inputs and outputs, rather than whatever they do now. I'm not familiar enough with Terraform to be sure (nor do I want to be too specific about my company), but I'd think that my company would have a harder time with something too declarative -- rather than having a programming language where you can still be declarative, but just with proper inputs and outputs.
1
I'm full remote. I had a local VMware host that I was connected to via PowerCLI...
I also happened to be connected to the production network and vCenter, from the same machine.
I forgot this, when I issued a command to shut down all VMs where PowerState = PoweredOn
🤦♂️
MASSIVE outage on the work network. Almost lost my job.
1
I'm grateful for the help, it's helping me to grasp what I could possibly do to solve those sorts of things. For now, I will start with the monitoring and some sort of lab to test out provisioning things automatically. Thank you :)
1
I'm happy to use Terraform CDK, Pulumi, etc. when one of them becomes the de facto standard, but none has yet. The standard is still Terraform/Terragrunt or CloudFormation. That's what every job description asks for. The end.
P.S. I'm sorry you feel sad about this fact.
1
I'm just now experiencing how dramatically limited, inconsistent, slow and unstable GH Actions is. I can compare our current, pretty complex architecture to a medium-sized project I ran on CircleCI about a year ago, which was actually just flawless: it's heaven vs hell now.
1
I'm late to the party but I find it an interesting topic.
Any follow-up on your end?
In my experience taking the initial step to management is a 2 year change process, give or take. We are basically finding our own way of managing and we learn to view organizations differently -- and it hurts.
Now to your question. You can of course go back; but statistics show only 20% (if I remember correctly) are fully satisfied. The change has just been too impactful.
Any unsolicited advice from me? Jump ship entirely? You've seen organizational dysfunction. You will not un-see it and it will not go away easily. Accept or leave. It sounds to me you should be able to find something new. Manager or not doesn't sound like the real question to me.
Cheers
1
I'm looking for a practical book with steps and case studies. So The DevOps Handbook is the go to?
1
I'm looking to switch to be full time in the field. I'm currently SWE with pretty extensive knowledge in system (unix/win), very comfortable in the cmd with bash/powershell and have worked with k8s and docker the last few years.
How do I convince my company to transfer me into a devops role?
1
I'm not actually a programmer - Im a sysadmin but will answer anyway.
A large client site experienced a crypto ransomware hack via rdp port 3389 breach. Entire site cryptod and restored.
Rdp 3389 to the rds server was subsequently shut down. No staff were informed besides the server project team which I was not a member of.
Being on the helpdesk at the time I received a call from the customer saying they couldn't rdp. So I checked the firewall and opened the port to resolve the issue.
Around a year later they were hit by a crypto ransomware attack via the same method yet again. This time they lost weeks of data, as backups were failing.
That was probably my most destructive fuck up but at the same time, fuck the boss and management for not even telling people about the crypto attack and change.
1
I'm not saying any of those languages don't have their own standard. I'm referring to the DevOps community in general, which as a standard, uses Terraform and Terragrunt. Then CloudFormation as a distant second, then pick a flavor of CDK in third place.
I do actually feel like Terraform CDK will ultimately be the future but when will it ever rise to being adopted at the rate of raw Terraform?...Who knows.
1
I'm not sure if you're in AWS, but have you looked into SSM?
In short, it's a daemon that runs on your AWS instances and listens for commands sent via the Amazon API.
You can then issue specific commands to your servers, which can then, for example, execute a script to do a "pull" type deploy.
Example: A server is configured with SSM. Your CICD system that runs on-premises creates a build artifact (i.e. jar file and database migrations) and uploads it to S3.
Then, the CICD system executes an awscli command that tells the server to execute an Ansible playbook which will pull down the artifact, run database migrations, and run the deploy.
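A rough sketch of what that CI-side call could look like: the instance ID, bucket and playbook path are made-up placeholders, but `AWS-RunShellScript` is the stock SSM document, and the dict matches the shape boto3's `ssm.send_command()` expects.

```python
# Sketch of the "push a pull-type deploy via SSM" flow described above.

def build_deploy_command(instance_ids, artifact_s3_uri, playbook):
    # Commands the SSM agent will run on the target instances.
    commands = [
        f"aws s3 cp {artifact_s3_uri} /opt/deploy/app.jar",
        f"ansible-playbook {playbook} --extra-vars artifact=/opt/deploy/app.jar",
    ]
    return {
        "InstanceIds": instance_ids,
        "DocumentName": "AWS-RunShellScript",  # stock SSM document
        "Parameters": {"commands": commands},
        "Comment": "pull-type deploy triggered from on-prem CI",
    }

params = build_deploy_command(
    ["i-0123456789abcdef0"],                      # placeholder instance
    "s3://example-artifacts/app/1.2.3/app.jar",   # placeholder bucket
    "/opt/deploy/deploy.yml",                     # placeholder playbook
)
# In the CI job you'd then call: boto3.client("ssm").send_command(**params)
```

The on-prem side needs only outbound HTTPS and AWS credentials; no inbound SSH into the instances is involved.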
1
I'm not sure what you're saying about "correctness". If someone has made modifications to the upstream resources, Pulumi will still tell you that's happened if you make changes.
The assumption here is that people aren't making ad-hoc changes to resources. If you know they are, you can refresh the resources. It doesn't change the correctness at all.
1
I'm not sure you understand what DevOps really means. It certainly doesn't mean you write application code and build infrastructure. Just because we're not writing application code doesn't make us "Ops" engineers. Applications are complex and so is infrastructure, monitoring, ci/cd, etc. and I would never expect or want an engineer to do both of those things. No one does that in real life and if they try, it's fantastically horrible.
1
I'm slowly switching over to terraform and CloudFlare.
1
I'm sorry but you didn't provide any guidance. All you said was "Google and figure it out" while OP was on a good path. You just sound like a bitter person that gets off on smacking juniors to feel that you're closer to the senior level. Nothing constructive in your behavior.
Then OP only mentioned infrastructure in the post and you feel the need to bring up complexity of the code? The code can be as simple as crud code and it's still a valid time sheet app.
1
I'm sorry for replying slowly; I wasn't able to check Reddit. I'm very grateful for your willingness to guide me, and I've sent you a DM.
1
I'm sweating already just imagining it
1
I'm thinking of building an open source version myself.
I used F5 GTM.
I can do the same with PowerDNS + Pipe backend and doing geoIP and then talking to Haproxy status endpoint for load balancing out of region if the local region is broken.
Alternatively I can use this https://jameshfisher.com/2017/08/04/golang-dns-server/
For the fancy DNS server itself.
Every availability zone in every region shall know of every load balancer in every other. They only use them if there is a failure in the local region on LTM
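For the PowerDNS route, the pipe backend speaks a simple tab-separated line protocol on stdin/stdout. Below is a minimal sketch of a handler (ABI version 1); the region map and IP-based region pick are made-up placeholders, and a real setup would do geoIP lookups plus HAProxy health checks instead.

```python
# Hypothetical region -> load-balancer-IP map; a real deployment would
# consult a geoIP database and each region's health endpoint.
REGIONS = {"eu": "198.51.100.10", "us": "203.0.113.10"}

def pick_region(remote_ip):
    # Placeholder logic standing in for a geoIP lookup.
    return "eu" if remote_ip.startswith("10.") else "us"

def handle_line(line):
    """Answer one line of the PowerDNS pipe-backend protocol (ABI v1)."""
    parts = line.rstrip("\n").split("\t")
    if parts[0] == "HELO":
        return ["OK\tgeo backend ready"]
    if parts[0] == "Q":
        qname, qclass, qtype, qid, remote_ip = parts[1:6]
        if qtype in ("A", "ANY"):
            ip = REGIONS[pick_region(remote_ip)]
            return [f"DATA\t{qname}\t{qclass}\tA\t60\t{qid}\t{ip}", "END"]
        return ["END"]
    return ["FAIL"]

# Real use wires this to stdin/stdout:
#   for line in sys.stdin:
#       for answer in handle_line(line):
#           print(answer, flush=True)
```

PowerDNS would then be configured with `launch=pipe` and `pipe-command` pointing at the script, which is what makes this approach so hackable compared to a commercial GTM.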
1
I'm up for any idea that helps people get back in the zone. Would it suit everyone? Probably not... but it's worth trying for those who get constant interruptions. As much as some places try their best, work from home has a breaking effect where it's hard to just jump up and ask things, or discuss over lunch. You miss the cues that it's OK to step in and chat to someone. I do my best to ask the dev to call or message me after a meeting or lunch, or before they leave for the day, so I don't break their flow, but it's not always possible.
1
I'm using Drone (on premises.)
>I need to be able to run many job parallel
It can do that. You have a drone-server instance and it will farm jobs out to multiple hosts via 'runners'. Each runner can run as many jobs as you define. You're really only limited by what the hardware can do. It can spawn jobs via k8s, various cloud providers, etc. I don't have use for those features so I can't speak to them.
>tag runners like some runners need to be dedicated for only some jobs
It can do that. You can tag a runner for a specific need and target that runner. You can also restrict that runner to a given repository or organization. There are some restrictions to using tags, but it works per your comment.
>I need to be able to run jobs on push/commit, open pr, closed pr, schedule, and manually
You can configure triggers to do most or all of these.
My use case is pretty simplistic so I haven't tried the other things you want feedback on, thus no opinion. Drone comes in two flavours: Open Source and Enterprise.
* Drone OSS you can't use runners, you can only run jobs wherever the drone-server is running. However, you can run as many jobs as you want.
* Drone Enterprise you can use for free for a limited number of builds per year. I forget the cap, something like 5,000. Plenty for a single non-commercial user. For commercial use you pay per user per month but you also have to pay support annually (which is about 10% of the user licensing.)
So if you wanted to test it out, you could easily do so and if it fit your need you could convert to an enterprise license.
Note: Support is actually useful. I discovered a bug and got it resolved relatively quickly, working directly with their dev team via Discord.
My take: Drone documentation sucks. Setting up a server/runner and integrating it with a SCM could be a lot better documented. That being said, once it's up and running it works well.
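For reference, here's a minimal `.drone.yml` sketch of the trigger and runner-tag routing described above (the image, label values and step are placeholders; check the Drone docs for your runner type):

```yaml
kind: pipeline
type: docker
name: build

# Route this pipeline only to runners started with matching labels,
# e.g. DRONE_RUNNER_LABELS=tier:restricted on the runner host.
node:
  tier: restricted

trigger:
  event:
    - push          # commits
    - pull_request  # PR opened/updated
    - cron          # scheduled runs
    - custom        # manual runs triggered via the API/CLI

steps:
  - name: test
    image: golang:1.22
    commands:
      - go test ./...
```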
1
I'm using it in my public [incidents-k8s Github Actions](https://github.com/chazapp/incidents-k8s/actions). I haven't fixed all the issues it found yet, but I've worked through some of them. It taught me some pretty neat tricks, but the issues it finds for k8s are pretty obscure and complex to fix.
1
I'm working with a few other people I've met on reddit setting up a homelab from the ground up. We're trying to incorporate as much devops tomfoolery as we can as well as some more general web dev stuff. If you're interested in joining in (or anyone else for that matter) just add me on discord. Zexanima#0329
1
I've already said I use Bash and am familiar with Python.
I've been an engineer for 25+ years at Fortune 5 companies with 400k employees to startups with 200 people and I have yet to be called to "write some bespoke software". My team supports devs in very massive ways with CI/CD, Infra, Monitoring, Alerting, deployment tools/methods, and on and on...but I've never been "dragged into supporting devs". If a software development team can't figure out a problem and they think asking a DevOps Engineer will do the trick, I'd say that team has bigger issues to solve.
I don't know where you're coming up with this angle but it's definitely not what I see happening at companies between DevOps and Software teams.
1
I've been in the DevOps game close to 10 years. Before covid there was a lot more regional comp philosophy but in this new Era of remote work you earn USD not Colorado-bucks. Some companies are still coming around but if they want to stay competitive and not just get what's left, they'll change.
Don't sell yourself short, learn quickly and fake it until you make it. If they only have one person typically in that role its probably all FUBAR anyways and you can blame the previous person for at least 1, maybe 2 years. Join some slack communities to run ideas or get help.
1
I've been in this position before -- expected 9-5 but it was 7:30am to evenings 12 hour days
There was no other Devops/SRE on the team -- we had a backend engineer who built some of the Kubernetes stuff but I had to maintain it
Startup very small -- using GitOps
I was taking stimulants to work crazy hours -- lost sleep and it affected my work performance and was making simple mistakes -- my opinion
LEAVE -- its not worth it
I learned a lot and met some good contacts and used that to leverage a better offer-- if you have Kubernetes knowledge along with Terraform you are gold -- not a lot of people know it
I didn't find Kubernetes -- Kubernetes found me
1
I've only ever been fired once in my life, and it was 28 days after joining a new company. I tried to install steam on my company laptop, to which I was blocked. I thought to myself "ok fair enough, they have software restriction policies, no biggie" didn't think any more about it. 2 days later they escorted me off the premises because I "breached security policies".
Lucky for me, the way the DevOps market works I had another job within 3 weeks but 🤷♂️
1
I've read half of The Phoenix Project.
The intention of the authors is good. But the writing and contrivances are just terrible. The whole book reads as the power fantasy of an operations person who discovers this magic "kanban" thing and then uses it to turn a dysfunctional company around in record time. All by proving that everyone else is not as perfect as the ops guy, who can do no wrong.
And as a programmer, I found that the book treats technical issues as trivial to solve. As if, by merely showing programmers that such things as kanban and cloud exist, they would, without any complaints, move the whole codebase to the cloud within weeks. And if anyone did complain, there would be some contrived way in which the protagonist convinces them they are clearly wrong, and they would start worshipping the protagonist.
And let us not forget the book teaches us how to pass a security review in a clearly insecure and broken company. Of course, by having friends in the audit group.
Sure, read it for the ideas. But take it with a significant amount of salt. And don't imagine that by showing people kanban, they will be able to bring all your applications into the cloud with continuous delivery within weeks. In practice, that takes years of focused effort and investment of time and money.
1
I've read it and enjoyed reading it. The stories were amusing.
Don't take this book or the stories as instruction. Figure out the teachings and techniques the person is applying and for what reason. I think those teachings still have value in current organisational structures, especially in a bad one.
1
I've read it, but I wouldn't say I'm impressed with it. The people in it act like robots, not real people; if you treated actual employees like many of the characters are treated, I think many of them would quit.
1
I've started requiring sensible variable names to pass PRs because someone is more likely to read that than the doc block above it.
1
I've tried to setup something like this in the past but had many issues. I understand you probably can't share source, but is it a CSV or something that terraform is populating record values with and you're committing that?
The version I ended up with is pretty much yaml objects with all of the r53 record fields populating it.
1
I've used GitHub Actions for small things like backing up the repository to an S3 bucket upon merge using an OIDC connection and IAM role that's in a GitHub Secret. Works great and no complaints. The secret is only as secure as its implementation.
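As a sketch, that kind of workflow looks roughly like this (the role ARN secret name, bucket, and region are placeholders; the IAM role's trust policy must allow GitHub's OIDC provider):

```yaml
name: backup-to-s3
on:
  push:
    branches: [main]

permissions:
  id-token: write   # required for OIDC token exchange
  contents: read

jobs:
  backup:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ secrets.BACKUP_ROLE_ARN }}  # role ARN kept in a secret
          aws-region: eu-west-1
      - run: aws s3 sync . "s3://example-backup-bucket/${GITHUB_REPOSITORY}/"
```

No long-lived AWS keys are stored anywhere; the runner trades its OIDC token for short-lived credentials on each run.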
1
I've used Jenkins for smaller jobs, but people I know who use it for larger jobs hate it.
I use harness next gen and absolutely love it.
1
I've worked at large companies who've adopted public cloud, which is understandable given the cost savings, but they've been convinced by various consultants to also adopt FAANG like delivery processes for which they have no need and just adds complexity and unnecessary overhead.
1
I've worked for startups and FAANG and I found you're right.
I've also found that FAANG has the inverse problem: they see the velocity of startups and they try to chase the buzzwords they hear without actually implementing it correctly, either.
1
I've worked in dev lead shops who go with pulumi as "everyone's a dev" and the devs pretty much do the devops.
Also worked in shops where devs are completely separated from devops and devops engineers handle the infra, in which case they use TF.
Really depends on the context. I like both.
1
I've written with all these IaC frameworks for years and can say without a shadow of a doubt the programming perspective (Pulumi with constructs or cdks) is in most situations the best choice.
1
I've yet to have a problem that I couldn't solve with plain terraform/hcl. But I've seen incapable devs struggle with the declarative nature of hcl.
1
I’d be curious to know what you think are footguns in Pulumi; honestly, it sounds like you haven’t used it much.
I’ve had way, way more issues maintaining large Terraform projects vs large Pulumi projects. Pulumi trivializes some things like backends, importing existing architecture, moving things between projects/stacks, visualizing changes, and staging out how things are deployed and what is dependent on what (without deploying your projects in a specific order).
None of that even gets to the fact you’re now writing code instead of a markup language, which has its own benefits- especially when using Typescript, and can benefit from things like type checking against managed policy ARNs in AWS.
1
I’d echo what other people are saying and just create a systemd service for the bot. That way you just copy the files and restart the service, and you don’t have to worry about it running in the foreground. That said, using the ampersand should work too, since it’ll just run in the background.
1
I’d stay with your current offering, but that’s me. What state are you in?
1
I’d think it was stealing my data.
1
I’ll be sure to look into this
1
I’m a financial analyst interviewing for a DevSecOps role next week. I’m qualified but it’s because I got a Computer Science degree on the side. Look into post-baccs. If you have one degree already, it’s only 2 years part-time. I did it while working full-time (but it was tough). I did Oregon State’s online CS post-bacc. Happy to answer any questions on it.
I’m not saying you can’t get into DevOps without a degree, but you’d definitely be qualified for a role in it if you did get one.
1
I’m confused does your company have developers?
1
I’m looking for the methodology and strategy that DevOps engineers use to keep up to date with learning goals and the industry - not so much "I’m learning these because they're going to make me the most money."
1
I’m lost as to why you think you can’t just “give” them the container.
1
I’m never doing k8s again. Ever. For new things, Lambda pushes so much below the line in the AWS shared responsibility model.
Worth it.
1
I’m no longer with the company, so I don’t have a detailed view, although I stay in touch with some old colleagues. I don’t expect there to be much cost savings though, as the limiting scaling factor was disk usage. The cluster used some 800 i3en.3xlarge instances.
Lots of savings on operations though, and probably some network traffic costs. We had massive amounts of tooling for cluster optimization, shard placement and reducing cross zone replication.
1
I’m not familiar with what you’re referring to. Can you give me a pointer?
1
I’m not sure the answer will be quite satisfactory. I started with 8gb memory and a 1tb hdd from 2016. That didn’t cut it, the hdd swamped everything on startup and couldn’t keep up. The page file was huge. So, you need ssd storage and at least 16gb memory. I have 32gb memory now and I don’t recall the utilization but it’s comfortable.
The rig has a Core-i7 3770k from 2013 too but that hasn’t been a bottleneck yet.
1
I’m not sure these are the distinctions I’ve ever seen fought over with something as involved as documentation, but I’d say text below image, following a general caption pattern.
I’d argue for something that’s not formatting driven. Keep it lined up as you wish but we tend to use figure markers if it’s not obvious. “Click on the button (1.a)”, so then image caption becomes “1.a”.
1
I’ve been in IT for almost 4 years and I’m not even confident enough to apply to a DevOps position; is there something I'm missing here?
1
I’ve been using Tekton as our standard in OpenShift, our on-prem PaaS, and now that we have AWS as a railed public cloud offering, we must use Terraform for our IaC. It's kind of frustrating that this was announced as an FYI, and our small, 5-person DevOps team (3 technical; one small group of many of various sizes in our enterprise) is a bit concerned about the steep learning curve, considering everything else we need to maintain while we consider jumping to this platform.
Any recommendations on good learning tools to put us on the fast track for learning Terraform?
1
I’ve had multiple conversations where I’m supposed to magically fix ES problems as you cited. I tell them to back it up to appropriate storage and I get stared at like I’ve grown a third eye.
1
I’ve interviewed more than 100 candidates during the last 5-6 months, and I can share some key things I never thought about back in the days when I was looking for work.
1. Regardless of how awesome you are technically, you will be a member of a team. A large part of the interview process will be centered around the question “Do I want to have him/her as my colleague?”.
Be the best teammate you’ve ever had, and try to be that guy every day.
2. Don’t lie or try to be someone you’re not. The guy in front of you has probably seen/heard it all.
3. How/which tools you use daily reveals more about your past work experience than the coding test itself.
1
I’ve only heard this term once with some company called Soda. I trialed it and had mixed feelings…
1
I’ve used CDK - it’s fun for small projects and if you have to drop an L3 in place somewhere quickly, but it’s a nightmare to work with in the long haul.
The interfaces lack consistency even across related infrastructure resources, and they are very brittle overall.
I doubt Terraform CDK will ever see the wide adoption Terraform has. Don’t get me wrong, we use both (Terraform is mostly used where we don’t have a Pulumi provider), and I’d pick basic Terraform any day over CDK.
1
IaaC but in Clojure? (It’s all data)
1
Ideally the only things inline are bootstrap code to fetch and run things from version control. Storing scripts inline is the path to madness.
1
Ideally, no one has access to production
1
Ideally, you might have a service-oriented architecture where each thing lives in its own "service". This makes it easier to do updates and push changes.
1
Idk if it THE go-to by any means, but it is well recognized and the companion to The Phoenix Project. Not nearly as “fun” to read, but I’d definitely point you in that direction. I refer to it regularly.
1
If a user does commit to the repo wouldn’t it just trigger the entire pipeline again? That would be the pipeline I would assume you would want to actually deploy anyways.
If the pipeline fails for any reason how is that handled? Is it that big of a deal to just manually rerun the pipeline?
1
If DevOps or any variant of DevOps was a University course, you would never get a degree.
1
If he's reading RFCs it made me think of when I was 20 and had a plaque of "man ascii" on the door of my bathroom. man ascii is a manual page from Linux that shows [the ascii table](https://man7.org/linux/man-pages/man7/ascii.7.html).
It would be a cheap but perhaps thoughtful and very nerdy gift.
1
If I am writing tools for my team and not an open distro, then yeah, I provide a Docker image with everything built in instead of providing them a module; building a module for tooling can be overkill.
1
If it's on a linux box, you can find and remove files older than x days with `find`, something like:
```
find /path/to/builds/ -type f -mtime +15 -exec rm -f {} \;
```
1
If it’s for application access you can use Mozilla SOPS to just encrypt committed configs.
https://autrilla.gitbooks.io/sops/content/
1
If one of your switches has DHCP capabilities, let it manage addresses for the machines. Any virtual machines or containers deployed on top of them should get addresses from their orchestrators.
1
If someone EVER finds a good solution to this problem, I am all ears and will pay almost any price.
I have reflected a lot in my free time about issues like this, and came to the conclusion that the problems on a microcosm are the same as happening on a macrocosm: How do you take notes, and are you using your notes regularly? For most of the people, the answer is a knowledge collection that is browsed here and then.
What seems to solve it for me on a micro level, i.e. for my personal notes, is the Zettelkasten method. However, one of its key principles prevents it from being applicable to teams and documentation: namely, that the Zettelkasten is very individual. If one can scale this method and find a way to apply the principles to a team, then there is a chance... But I don't see it yet.
1
If someone is looking for a free solution - AWS Parameter Store.
It doesn't have automatic secret updates, but a simple cron->lambda does the same.
1
If something is outsourced then it generally means that they want a low-cost company to maintain something. It's hardly exciting work so I would struggle with it unless the pay was well above market rates.
1
If the autoscaler takes too long to spin up more pods, your min pods should be higher or your scaling threshold should be lower.
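As a sketch, assuming a Kubernetes HorizontalPodAutoscaler, those two knobs look like this (all names and numbers are illustrative):

```yaml
# Hypothetical HPA tuning: raise the floor and lower the scaling
# threshold so capacity is added earlier
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 3            # higher floor absorbs bursts while new pods start
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50   # lower threshold, so it scales out sooner
```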
1
If the shop is using Kubernetes, 100%, if not they might still be k8s-curious.
1
If they bring the ease of use of Azure DevOps over to GitHub Actions, I’d be so freaking happy.
My pipelines are basically.. here’s a window, type bash or powershell in them and the servers will execute… so easy
1
If they try to offer something lower simply because of location not being CA, I'd explain to them housing market and inflation are across the board, not location based. Costs are up everywhere and rapidly increasing. This puts you in a position to where you have to account for future increases and would *hate to have to leave them* early into a position because the salary didn't account for this.
1
If this default is indicative that pulumi as a whole philosophically prioritizes performance over correctness, this is worrying!
1
If you *don't* have massive imposter syndrome three weeks in are you even really doing DevOps?
What you're feeling is totally normal. You're drinking from the firehose. Just keep trucking!
1
If you are comparing Terraform and Pulumi, I would recommend you try CDK for Terraform first.
[https://www.terraform.io/cdktf](https://www.terraform.io/cdktf)
Once you try this, you are ready to compare TF with Pulumi.
Till then you are comparing apples to oranges. ;-)
1
If you are dealing with bare metal or VMs, Ansible is the best tool to start with. You can secure your secrets easily, and with a git repo you get version control, change history, easy rollbacks, etc.
If you are already using SSH to connect, the infrastructure to use Ansible is in place; just install it on your machine. Tons of training material is available.
1
If you are looking for a non-technical explanation of agile development then this [article](https://linearb.io/blog/how-to-improve-your-agile-ceremonies/) should be helpful.
1
If you are on AWS just use AWS Secrets Manager. It is very simple to get started with.
1
If you are outside the norms, which I personally think is absolutely fine, just make sure you take it on yourself to keep the team perpetually informed on what you are doing and your blockers, and make sure to ask for help when it's convenient for your teammates. As an industry, we should worry less about when people work and focus more on doing good work and expecting good from our teammates. We are all adults, and not only should we act like adults, we got into technology because we like it. No PM, manager, or fancy title will influence us otherwise.
1
If you are really choosing between GitHub actions and Jenkins, go with GitHub actions. Jenkins is old, almost a whole generation behind. Yes you can customize everything- but just about everyone ends up with a mess.
I’ve migrated customers to GitHub actions and we write the same pipelines with 90% less code. GitHub actions is a modern ci/cd system.
I do agree with others that it’s short sighted to move off gitlab to save some cost- all of the competitors are essentially the same and you will pay somewhere, eventually.
1
If you are switching to GitHub Actions, you may face the problem of job parallelism. I mean, job parallelism is only available on GitHub-hosted runners; for self-hosted runners I don’t think there is an option for parallel job execution.
1
If you aren’t programming in your devops job it’s probably not devops.
1
If you asked me 6 months ago, I would say in-house, but the company I worked for started a reorg and here I am now working for an outsourcing company (working with a startup from Berlin) 😅 It depends, mostly on the client/project. Some projects can be great; some you want to run away from a couple of months in. Good outsourcing companies should be flexible in those cases when a change of project is requested.
1
If you can separate out the secrets that would be best, but yeah source controlling secrets is dicey. I have seen repos being managed separately for those and... it works but it is messy. Build agents need access and a limited Dev/SysOps group ... it is just painful. Better to use something purpose built.
1
If you differentiate the builds you intentionally want to retain (e.g. past 5 releases per service) from other builds, you can both exclude retained builds from the policy and also reduce the storage tier to save money.
1
If you don't want to trust Ansible, trust the hordes of others who use it every day :)
[Here is a sample playbook](https://github.com/ChadDa3mon/Examples/blob/main/ubuntu_base.yaml) I use at home to ensure all of my Ubuntu VMs are configured the same way.
1
If you have a CKA it will certainly help land interviews with companies that are using k8s or looking to migrate. What will be more important in the interview is selling yourself and your willingness to learn/grow. I think taking the initiative to get a CKA (not an easy exam) will help speak for that to an extent.
As with any certifications though they don't necessarily teach you practical skills. You'll still want to read up on things like argoCD, Helm, and terraform for example. There is a CKA course on udemy that gives hands on lab access to clusters on kodecloud. It's a great way to skill up on k8s with practice and ultimately a sought after certification.
Not all companies use k8s though so it entirely depends on where you're applying.
1
If you have a problem reverse-engineering a one-liner bash script, sorry, but your 15 years of “career” were wasted.
Even a junior would solve that. It's not about the task per se, but how you react to something new.
Not being able to use “man” or Google immediately disqualifies you; it shows you can't work under stress, or at all tbh.
1
If you have an agile team or work around the agile methodology then nTask is the tool that can help you make your agile development smoother and get more done in less time. It is a robust tool that can manage your tasks, projects, issues, risks, and teams all in one app - nTask. There is no need to switch between different apps to manage your work.
1
If you have multiple pod replicas for your nginx ingress controllers and your application deployment, I would start by reducing them both to 1 replica each. This way you can start ruling out any inconsistencies with the way the nginx controllers or the k8s Service abstraction handles load balancing.
1
If you have Proxmox I have a guide series for using terraform to deploy VMs and then Ansible for getting kubernetes on said VMs. It’s a lot more straightforward than some of the giant playbooks, mostly because my Ansible skill level is beginner.
https://austinsnerdythings.com/2021/08/30/how-to-create-a-proxmox-ubuntu-cloud-init-image/
https://austinsnerdythings.com/2021/09/01/how-to-deploy-vms-in-proxmox-with-terraform/
https://austinsnerdythings.com/2021/09/23/deploying-kubernetes-vms-in-proxmox-with-terraform/
https://austinsnerdythings.com/2022/04/25/deploying-a-kubernetes-cluster-within-proxmox-using-ansible/
1
If you haven't broken prod at least once in your career can you really be a devops/sysops/developer? :D
I once updated a ticket on an ITSM platform and somehow managed to trigger an automation loop that took down our multi-million dollar platform for 3 hours. The best part is every time it restarted the loop-o-death would also restart.
1
If you know Python, setting up Ansible plays shouldn't be too hard.
What's nice is you can set up custom banners, aliases, and all those non-essential nice-to-haves as well.
Cool story, btw.
1
If you know your way around Linux you're more than set for entry-level.
1
If you need to change stuff on all hosts, then you update your Ansible playbook and run it again; that's the whole idea.
Your last point is the entryway to host grouping in Ansible. You can create groups of hosts and include different playbooks for them, so hosts in group A will have Apache+PHP, group B will have Nginx and Python, and group C will have MySQL.
Apart from that, you'll have a global playbook to set up LDAP, general monitoring, log forwarding, and the stuff all servers have in common.
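A minimal sketch of that layout, assuming an inventory with hypothetical groups `group_a`, `group_b`, `group_c` and role names made up for illustration:

```yaml
# site.yml (hypothetical): one global play plus per-group plays
- hosts: all                 # stuff every server has in common
  roles: [ldap, monitoring, log_forwarding]

- hosts: group_a             # apache + php
  roles: [apache, php]

- hosts: group_b             # nginx + python
  roles: [nginx, python_app]

- hosts: group_c             # mysql
  roles: [mysql]
```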
1
If you set bucket policy to deny, only root account can reset the permissions. If you do that on a KMS key resource policy, only AWS support can.
1
If you use scripted pipelines then check out my shared library https://github.com/DontShaveTheYak/jenkins-std-lib
1
If you wait for final requirements in a fast-moving world, you'll never deliver anything. No product is ever final, it's always just enough.
1
If you want 2 different apps on the same domain, you need a reverse proxy. You can use a different domain, but then you'll need to enable CORS.
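A minimal nginx sketch of the reverse-proxy option (domain, paths, and upstream ports are all hypothetical):

```nginx
# route two apps under one domain by path prefix
server {
    listen 80;
    server_name example.com;

    location /app1/ {
        proxy_pass http://127.0.0.1:3000/;   # first app
        proxy_set_header Host $host;
    }

    location /app2/ {
        proxy_pass http://127.0.0.1:4000/;   # second app
        proxy_set_header Host $host;
    }
}
```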
1
If you also want to take the chance to run this with containerization:
You can run the EC2 workloads as Fargate tasks (which will be resilient to reboots),
and have the tasks target-grouped behind an AWS Elastic Load Balancer. You need at most 2 Elastic IPs (EIPs) for this.
ELB almost never goes down, unlike running your own Nginx. Fargate is less complex than EKS (Fisher-Price containerization), and it will provide SSL termination (you must also provision an AWS Certificate Manager certificate in the same zone).
Your routing would then be infrastructure as code from your TF / CF YAMLs.
Updating the workload spec with new versions (e.g. with a new values file or the specified CF args) will also grant rudimentary CI/CD functionality.
1
If you want to switch from 1 resource to 3, you'll need to use `terraform import` on the 2 existing resources to make the state happy.
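As a sketch, assuming an `aws_instance` being split from one resource into a counted set (resource names and instance IDs are hypothetical):

```hcl
# main.tf: the resource now declares 3 instances instead of 1
resource "aws_instance" "web" {
  count         = 3
  ami           = var.ami_id
  instance_type = "t3.micro"
}

# Bring the 2 pre-existing instances under the new indexed addresses,
# then plan/apply creates only the third:
#   terraform import 'aws_instance.web[0]' i-0123456789abcdef0
#   terraform import 'aws_instance.web[1]' i-0fedcba9876543210
```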
1
If you were my employee, I would want you to feel comfortable saying you've got issues with your sleep and it can make you miss half a day. It's a weird one but being aware that this might sometimes happen is good for a manager to know so they can expect it and deal with it when it happens. I've been woken up by colleagues calling asking where I am, if you have troubles with your sleep then things like this can happen.
I'd offer to call you if you were late beyond a normal time ie 10am in case the same thing happens. Also that it's fair to take a day off to catch up on significant sleep loss.
1
If you work with kubernetes and terraform, that's not something you do once in a year. Also, they don't pollute the environment either. They are themselves just binaries.
1
If you're new to the IT industry, chances are that you won't start out doing devops work. You'll need a solid grounding in development processes, application infrastructure, networks, operating system mechanics, and application troubleshooting.
1
If you're on the Azure stack, then Azure DevOps is a decent option, especially considering the amount of integration it has with cards and DevOps automation
1
If you're using TF Cloud, you can have a workspace that watches for changes and then auto-applies.
Run an apply that instantiates a cluster, sets up IAM and load balancers, and throws a templated file with an Argo bootstrap into the monitored workspace. Terraform picks it up after the first run is complete, applies it, and now you have whatever apps you want running from a single manual apply.
1
If you’re only looking for a simple scheduler/orchestrator also take a look at HashiCorp‘s Nomad.
1
If you’re targeting local, you could join a local tech Discord or Slack group. It’s also a good resource for jobs.
1
If your base is an aws metric, you're going to have the same reporting cadence that's configured for that aws resource. So by default, unless you've enabled advanced metrics in aws for your elb, you're getting metrics only once every 5 minutes.
This is the same constraint you'd have in cloudwatch. You probably want to enable advanced metrics if it's a user impacting elb.
1
If your devs don't have a standard setup, docker isn't going to fix it.
1
If your domain zone is in Google Workspace, then there.
1
If your job is to mop a room, then yes,
* get a Roomba,
* make it vacuum one room while you vacuum the other.
This will
1) expedite your work,
2) cut your effort in half, and
3) let you use the additional time on your hands to improve the quality of your work.
In my opinion that is expedited delivery of quality work. It is DevOps.
[https://www.idgconnect.com/article/3579778/what-does-the-rise-of-devops-mean-for-agile.html](https://www.idgconnect.com/article/3579778/what-does-the-rise-of-devops-mean-for-agile.html)
1
If your project is a library intended for use in other projects, packaging allows it to be easily installed via pip or poetry.
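A minimal sketch of what makes a library pip-installable (project name and dependency are hypothetical):

```toml
# pyproject.toml: enough metadata for `pip install .`
# or for publishing to a package index
[build-system]
requires = ["setuptools>=61"]
build-backend = "setuptools.build_meta"

[project]
name = "my-team-lib"
version = "0.1.0"
dependencies = ["requests"]
```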
1
I'm not sure I got your point. But I'm just saying that calling an ops company "shitty" does not help him solve his problem in any way. I mean, OK, he got the message, and what should he do now? Resign?
1
I'm pretty new to DevOps; I was sort of thrown into it as a SWE. I’ve learned how my team works with GCP/cloud, Terraform, and Jenkins, and learned to complete tasks, but now I'm using KodeKloud to build foundations for all the skills: Git, Linux, Docker, Kubernetes, Jenkins, Ansible, cloud.
Also, I'm studying for GCP certifications.
Personal-development-wise, my approach was: I want to become highly skilled as a DevOps engineer to open my chances at higher levels, so now I just find the skills needed to get there and work on them.
1
I'm using advanced metrics and have them in CloudWatch at a 1m interval.
But I cannot configure (or haven't found yet) an option to scrape a few metrics from the ELB every 1 min.
1
image legends are usually below :)
I prefer multiple column also, but otherwise if I don't have the choice, I explain how to get to that screen where I did the screenshot, place screenshot, then details.
1
Imagine GitHub wiki version references being updated BEFORE a real release happens, or GitHub Pages pointing to a non-existing version.
That would be the case if you update BEFORE release.
It's a common practice to keep your documentation together with the source-code like you yourself mentioned.
There is a difference in managing internal documentation and documentation that is visible to everyone on the internet.
1
IMO DSLs are good in that they restrict people doing "more than just a loop". They enforce simplicity.
It's not for all use cases, sure, but the majority are better off without shit code.
1
Imo whether or not your git workflow uses a gui or not doesn’t matter at all. At the end of the day it’s all doing the same thing under the hood.
1
IMO, some security people get too attached to theoretical security instead of practical security. If you are using AWS for your networking, compute, and encryption keys, then using them as your secret manager seems minor.
If you aren't willing to trust a cloud providers networking, compute, and encryption, then you shouldn't be using them. It's the half-in, half-out that I think is security theater.
1
IMO, the Phoenix project is a book useful to managers. You can't apply its goodness when only being a pawn.
1
Import the git project into Gitlab, then rebuild and publish into Gitlab's internal registry.
1
Imposter syndrome is a killer for the first 3 months at any Devops or infrastructure engineering job.
It's unfortunate, but the reality is that most startups and small tech companies need to get shit done fast, and documentation, especially the basics, is often overlooked.
Don't be too distraught! You got this! If you can manage VMs in a production environment I'm positive you can pick up some docker basics quickly. You got this. Don't be afraid of asking for help!
2
Imposter syndrome, it's normal. Seems like you have a great opportunity to learn, the struggle pays off when you gain the knowledge.
1
In 2003 or so, I was on an IP KVM clicking through a SharePoint server, troubleshooting a plugin I wrote, on prod nonetheless. The IP KVM was jittery and I got frustrated and did the click, click, click, click + wiggle-mouse-around. I dragged and dropped a bunch of folders into a subfolder, which broke SharePoint. The hot backup was still running from the previous night, and the tape backups had just been picked up by Iron Mountain for off-site storage.
SharePoint was down for 3 days. The org just transitioned all their workflows to SharePoint the month before. Oops.
1
In a "I need a webserver, a database and a couple of cronjobs/api endpoints" way, that's about it.. Even though the actual application still has to be written :D
1
In a word, yes.
At a certain scale, you'll want to consolidate build logic to a central point, rather than maintain project pipeline copypasta in as many discrete git repos. In your case, this means reducing the pipeline config/code for each active project branch to install and run a common script.
The goal is to have a minimal project pipeline configuration that simply "installs and runs" one or more stock pipeline scripts. Everything else goes in some DevOps pipeline repo. The trick to this is deciding on a distribution method. This can be as primitive as a live `git clone` at the top of your pipeline jobs, or as sophisticated as docker images, packages, GitLab-CI include statements, or what-have-you. Once you have that, the rest should fall into place.
In general, it is best to treat your pipeline scripts like any piece of software and follow software development best practice. Nevermind that it's written in Powershell, BASH, or anything else. Keep things generic and reusable. Build a library of common functions and re-use code. Document everything. Build tests that can validate/invalidate code changes. Run your pipeline codebase itself through CI to run tests, exercise a sandboxed project, and package things up for use downstream.
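For the GitLab-CI case, the "minimal project pipeline that installs and runs" can be sketched like this (the repo URL and script name are hypothetical, and this uses the primitive live-clone distribution method mentioned above):

```yaml
# .gitlab-ci.yml in each project: fetch the shared pipeline repo and run it
stages: [build]

build:
  stage: build
  script:
    - git clone --depth 1 https://gitlab.example.com/devops/pipeline-lib.git
    - ./pipeline-lib/build.sh
```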
1
in addition to the other poster's response, it's a plaintext [CD metadata format](https://en.wikipedia.org/wiki/Cue_sheet_(computing\))
1
In addition, writing the blog articles will give you opportunity to identify missing things and structure the things you’ve learned.
1
In case others are looking:
- [AWS Secrets Manager](https://aws.amazon.com/secrets-manager/)
- [Azure Key Vault](https://docs.microsoft.com/en-us/azure/key-vault/)
- [GCP Secret Manager](https://cloud.google.com/secret-manager)
1
In general, microservice platforms like K8s allow you to decouple applications, moving away from monolithic structures.
1
In Kubernetes, with Helm, I usually configure a pre-install/pre-upgrade (or something similar) hook that creates a job to run only the migration; if it applies the DB migration successfully, the new deployment is installed.
This pattern enables a faster start for all pods (because they don't need to apply the migration), and it fails fast and aborts the upgrade if the migration is broken.
It also prevents your application from getting stuck on a DB migration lock at startup if you have a cluster crash or several nodes go down at the same time.
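A sketch of such a hook (the image and migration command are hypothetical; the annotations are the standard Helm hook annotations):

```yaml
# templates/migrate-job.yaml: run DB migrations before install/upgrade;
# if the Job fails, the release is aborted and pods never roll out
apiVersion: batch/v1
kind: Job
metadata:
  name: "{{ .Release.Name }}-db-migrate"
  annotations:
    "helm.sh/hook": pre-install,pre-upgrade
    "helm.sh/hook-weight": "0"
    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
spec:
  backoffLimit: 1
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: migrate
          image: my-app:1.2.3            # hypothetical app image
          command: ["./manage", "migrate"]  # hypothetical migration command
```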
1
In my case it's been more an issue to trying to Terraform DRYly. You end up making heavy use of `for_each` and config maps built in `locals` and two months later your Terraform project gains sentience and tries to start judgement day.
1
In my CI, I push to GitLab AND Docker Hub. One job pushes `latest` on each push to master; the other runs when tags are pushed.
```yaml
push:
  tags:
    - docker
    - linux
  stage: push
  only:
    - master
  script:
    - echo $CI_JOB_TOKEN | docker login YOUR_GITLAB_REGISTRY -u gitlab-ci-token --password-stdin
    - docker push YOUR_GITLAB_REGISTRY/YOUR_PROJECT:latest
    - echo $DOCKERHUB_PASS | docker login -u $DOCKERHUB_USER --password-stdin
    - docker tag YOUR_GITLAB_REGISTRY/YOUR_PROJECT:latest YOUR_NAMESPACE_DOCKERHUB/YOUR_PROJECT:latest
    - docker push YOUR_NAMESPACE_DOCKERHUB/YOUR_PROJECT:latest

push_version:
  tags:
    - docker
    - linux
  stage: push_version
  only:
    - tags
  script:
    - echo $CI_JOB_TOKEN | docker login YOUR_GITLAB_REGISTRY -u gitlab-ci-token --password-stdin
    - docker tag YOUR_GITLAB_REGISTRY/YOUR_PROJECT:latest YOUR_GITLAB_REGISTRY/YOUR_PROJECT:$CI_COMMIT_TAG
    - docker push YOUR_GITLAB_REGISTRY/YOUR_PROJECT:$CI_COMMIT_TAG
    - echo $DOCKERHUB_PASS | docker login -u $DOCKERHUB_USER --password-stdin
    - docker tag YOUR_GITLAB_REGISTRY/YOUR_PROJECT:$CI_COMMIT_TAG YOUR_NAMESPACE_DOCKERHUB/YOUR_PROJECT:$CI_COMMIT_TAG
    - docker push YOUR_NAMESPACE_DOCKERHUB/YOUR_PROJECT:$CI_COMMIT_TAG
```
1
In my company, we have 1 bitbucket workspace. In it, each team has their own projects. Each project has multiple repos for frontend, backend etc.
1
In my experience it has worked sometimes before 1.0.0, but a Hashicorp employee once referred to this as "undefined behaviour" to me. 🤷🏻‍♂️
Further, there is an open issue from 2015 with 400 upvotes, so I guess I'm not alone. https://github.com/hashicorp/terraform/issues/2430
1
In my experience, how demanding a job is correlates inversely with how much it pays.
1
In my experience, serverless services have most of their testing as traditional unit or component tests. Special scaffolding may be needed to invoke the code from a test runner (e.g. emulating the input to a Lambda function).
If you want to run functional or end-to-end tests on a serverless service, you typically need an infrastructure-as-code (e.g. CloudFormation) description of your app so that you can create and destroy resources programmatically.
Finally, it is common to have multiple versions of the service set up in parallel so that you can do manual verification of changes before you roll them out. Of course, if you have a CloudFormation template of your service, this is trivial to accomplish.
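A minimal Python sketch of that "special scaffolding" idea: the handler and the event shape here are entirely hypothetical, but the pattern is just building a fake event and calling the handler as a plain function, with no AWS involved:

```python
import json

# hypothetical Lambda-style handler under test
def handler(event, context):
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"greeting": f"hello {name}"}),
    }

# the scaffolding: emulate the input a real invocation would receive
# (the context object is unused here, so None is fine)
fake_event = {"name": "devops"}
response = handler(fake_event, context=None)
print(json.loads(response["body"])["greeting"])  # hello devops
```

From here, a test runner like pytest can assert on the returned dict the same way it would for any ordinary function.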
1
In my experience, when developers also run operations, all the infrastructure they provision is massively insecure, ports open all over the place, they rarely have a plan for patching, and no plans for repeatable infrastructure, or any sort of project standardization. I say this as someone who came from the dev side, but who was, when I moved to DevOps, tightly integrated with ops/networking/security for years and found most of the ops side’s complaints about the requests from the dev side to be valid (asking for packages to be installed on servers that had known vulnerabilities, requesting too much access to servers, etc…). Most of the roles I’ve been in were on teams that were providing IaaS or PaaS for developers who were external either to IT, or were offshore contractors, YMMV.
1
In my job search at small companies, no one has asked for leetcode. I saw a job posting or two asking for knowledge of algorithms and just didn't apply. Kubernetes seems to be a common requirement, though.
1
In my last job we rolled it out to all Linux VMs and never looked back. Would do it again.
1
In my previous job we used BB as our git vendor and Jenkins as our CI/CD tool. We quite often had availability issues with BB: the service was down, we couldn't commit, and it disturbed our work at the least expected times.
But the integration with Jenkins was very good; the webhooks in the BB plugin worked really well. We had some nginx rules to allow only traffic from BB servers to access the webhook ports.
1
In the bad old days (when we built physical infrastructure) we used to have public and private subnets as we used to protect the network. Most often we didn’t have IP or even physical port level security beyond a VLAN - a firewall pair would protect one or more subnets as a whole, and anything on the same subnets would usually be able to talk to each other without issue on any port.
In the modern era it’s less of a need as we secure to the individual IP (less often) or tag or SG. The only time you care public or private is when you are doing strict security and working to prevent ingress/egress via a particular mechanism.
That said, I've walked into a bucketload of infrastructures that still use IP ranges as a source address, which you shouldn't do if it's internal to your infrastructure.
1
In the case of k8s, it’s useful to use these trending/popular tools because the talent you are attracting will want to use these tools.
1
In the long run, the scarcity of dev skills makes your labor far more valuable. This in turn allows you to negotiate offers at companies that offer far more than just work life balance.
1
In the middle of the same transition but to EKS. Congrats on the transition, that’s a lot of work. Hope you found the value we’re expecting to find too! :D
1
In this context data is code. Anything you have under version control.
1
In your README.md, consider including an `Example Output` section underneath the `Usage` section.
1
Influx has historically not done well for sufficiently large cardinality data sets. The licensing is garbage, too, with everything being commercial support.
1
InfluxDB, Informix, Prometheus.
1
InfluxDB?
1
Inform the company and have the A record removed.
Depending on what your jurisdiction is I'd keep a record of your attempts to contact the company in order to be prepared *if* someone is accusing you of "hacking" their DNS.
That would totally *not* surprise me unfortunately. People have been indicted before for manipulating the DOM of a company website in their *own* webbrowser, so...
1
Infrastructure as mp4
1
Infrastructure as what? Lmao
1
Install Arch Linux and tinker around. Compile some programs. Play around with git. Be curious.
1
Instead they pollute your main system with large redundant images you don't need!
1
Intellij crying in the corner ; _ ;
Everyone fancy already switched to vscode I assume?
1
Interested in average monthly costs say for a team of 10 devs?
1
IPv6 provides a virtually unlimited number of addresses, so each and every NIC can have its own unique IP (though I don't know whether that comes from the spec).
NAT for IPv4 is a consequence of the fact that we have been out of free IPv4 addresses for a few years now.
1
Ironically, internal documentation as code was one of my biggest frustrations. So much effort maintaining it for little benefit over SaaS solutions.
1
Is that actually a thing?
1
Is this the devops version of "tabs" vs. "spaces"?
1
Is your lifestyle going to change with this 40k p.a.?
1
Is your team hiring? My company has no idea what devops means.
1
Isn't Kubernetes Infra as Data?
1
Isn't the whole idea of devops "you build it, you run it", rather than having a separate ops department? If so logically they should have access to prod and be able to run the scripts themselves, if they don't have the required skills they should develop them within the team.
1
Isn't this exactly what Influxdb were designed for?
1
Isn’t the tag the reference of the build?
So you’re saying I essentially cannot do this:
docker pull docker.io/image_name:1.0
docker push registry.gitlab.com/image_name:1.0
?
1
It all depends on the firm and whether their DevOps engineers are just scripters or whether they are working full-stack with dev teams to deploy and support the scaling of software.
What I will say is that your skills in any particular framework will likely atrophy to some extent - new features get added and you will not have time to follow the ins-and-outs of any particular framework.
However, in many companies, you will get exposed to a plethora of different programming languages and frameworks as you deploy everything from DotNet to Ruby on Rails to Java code, all the while contributing to efforts like refactoring how apps are doing connection pooling, circuit breaking, or configuration so that they perform better in production.
In summary, some of your skills will atrophy, such as skills specific to a framework, but in a true devops role you will grow as a software engineer in other ways.
1
It blows my mind that they still don’t support signed commits.
1
It can be the hard part. Like when the company is already invested heavily in the Atlassian cancer to the point of having things set in stone by professional ‘agile consultants’. (i know, it’s still not a tool problem per se)
1
It definitely won't hurt, and can probably help, as others have said. But--and this is something that can only come with experience, I think--it's always worth asking during interviews why a company is choosing K8s.
I used to think it added unnecessary complexity, and to a certain extent, I still do: but my refined grug-brain developer mindset has grown over the years to appreciate what K8s does.
Today, I *still* think K8s adds a lot of complexity, because it does, but now my questions in interviews are an attempt to ascertain if that complexity is truly needed to do the things companies are hiring SREs to do.
Most of the time the answer is "yeah no you don't need k8s for this and you don't need an SRE either" and I move on, but sometimes the answer is "okay, that's a good reason to adopt kubernetes. Let's talk more about this"
1
It depends :) I had an outsourcing stint in my career some time ago and it was nice to be embedded in an enterprisey environment with good management(ie. they let me do/implement things.. otherwise all the red tape can be a turn-off)..
I also enjoyed working with different customers, different environments and totally different industries.. But that only applies if you're interested in who's sitting on the other end of the network cable and get a chance to get in contact with the people you're servicing.
But what ultimately drove me off was that everything had a price tag to it. People don't outsource if they can do it themselves for less money, they usually come in with the expectation that they will be worked on an assembly line, but still manage to come up with very unique problems :) That might not be a bad thing on projects, because often, one-off costs are just shrugged away, but in the long run, everything you do, every task you perform is considered cost that will decrease the profit.
That's not a bad thing in itself (especially if you're supposed to hand things off to cheaper colleagues abroad anyway and you're just supposed to deal with the customers, soak up as much knowledge as possible and hand it over to someone else).. And working with people from all over the world can be a fun thing, too.. But I've had little opportunity to move things on a big scale.. e.g. we often purchased a lot of small, new servers for a new customer instead of investing beforehand, buying some big iron and using it for multiple customers.. In the end, it would have been cheaper, but it's probably harder to sell to people.
But all that "Go to the customer, pick up everything there is and recreate it in our data center" became a bit boring over time..
1
It depends.
Programming in usual DevOps roles is limited to small and simple scripts. You're not generally concerned with object-oriented programming, fancy data structures, algorithms, software design patterns, etc.
I would say the difference is like 5th grade math compared to university level math. No offense to anyone, I'm in DevOps too but lets face the facts here.
That being said...
If you genuinely enjoy both aspects (programming and DevOps), look for **Software Engineer - Infrastructure** type positions. They will generally have a deep focus on programming and infrastructure, like the name suggests.
1
it does go on forever, and then it changes underneath you. It's like a foot deep, mile wide river going 60mph.
1
It does seem like Apple's `Virtualization.Framework` that we use for Tart is tightly integrated with the M1/M2 chips. We did [some synthetic benchmarks](https://browser.geekbench.com/v5/cpu/compare/14966395?baseline=14966339) and saw almost no overhead from the virtualization.
1
It feels impossible. I dedicated a few days documenting a bunch of stuff a year ago. Since then I've not had the time to document anything else and quite a bit of that documentation is out of date.
1
It hasn't always been the case; to their credit, Azure does seem to be investing heavily in their AKS ecosystem, and that is evident from the improvements they have made. That was very much not the case 2 years ago (my most recent AKS usage at scale was about a year ago). There are a lot of things they could be doing to have a tighter release schedule, ranging from how their k8s plugins for supporting network/storage interfaces are designed/tested (usually a major hurdle) to how willing they are to absorb support requests. You can look at [GKE](https://cloud.google.com/kubernetes-engine/docs/release-notes-stable) for another data point - every provider has test and integration processes unique to their environment that dictate their release cycle, and none are all that similar. The more bespoke the integrations (or running something like k3s), the rougher the timelines are going to be for stable releases.
1
It helps knowing a bit about all of them when debugging why an application doesn’t run in the infra. But in general python and bash. Sometimes golang since some Devops tools are written in this language.
1
It is basically moving the actuator further away from the data and removing a lot of logic. Essentially GitOps. CUE is a nice example of how to embrace this and build some nice tooling around it. However, I haven't had the chance to fully implement it somewhere.
Also, I don't think we need another X-as-Y acronym...
1
It is renewing and returning ssls
1
It is, so is the era of developers making those decisions “alone” because it’s easier for them.
More conversations are happening and we’ll have quite few years ahead of us where we all need to have these discussions around the _who uses what and why_ plus _who makes the decisions_.
1
It might still be helpful to have them all in a central place instead of gathering them from dozens of AWS accounts during an audit.
1
It seems a bit unclean to do this, though. I know it's possible but I'd rather have them as clean as possible when I get to this point.
1
It seems it is, but I did not know :^ ] heard this buzzword today.
1
It sounds like they want to fill the role asap, which isn't unusual. The question is whether you will mind being a one-person team. If others can fill in while you take vacations, etc, and you think you can learn well on your own, this is probably not a big red flag. That's how it is sometimes with smaller companies and teams.
The third question, did they say your salary was outside of their budget? They should be up front about that, at least.
1
It sounds like you need some framework / guidelines to guide all this amount of work.
You can take a look at The DevOps Handbook for a good starting point, and use it to draw a "Roadmap to DevOps", with different milestones for your company.
1
It sounds like your company just heard the buzzword "devops" and decided that it sounded cool and they should have a "devops" person(s) because reasons, or you are conflating all things cloud with devops (which is a common misconception, so don't feel bad if that is the case)
That said, you should just focus on 1 thing at a time just like you would as an IT admin or engineer. What are your goals and what are some options you have to reach those goals? Maybe you need help with what options are available at all? Do you want automated infra deployments or do you just need CI/CD to manage your existing codebase? Are you using containers, VMs or both; any k8s? What sort of Azure services are you currently leveraging? Are your applications even ready to be deployed in a cloud environment at all? Maybe all or some of this is just word salad to you and you need more baseline help to get started?
If you are really looking to adopt for real devops, where are things culturally? Have you adopted real Agile practices like SCRUM or kanban, if not, which options make sense for your org? Are you currently closely working with the dev team to learn each others needs and wants? What sort of development method are you and/or the dev team currently using, what kind should you pivot to if need be? Who from the dev team will be forming the devops team with ops? Don't forget the whole point of devops is the merging of development and operations, siloed dev and ops teams isn't real devops!
If there's anything I learned going from ops to "devops", it's that at the core of the work I don't do anything differently than when I was a sysadmin; only the tooling and some work culture things changed. My job is still: assess the goals/issues, come up with possible solutions based on research, build out a PoC, extensively test the PoC and tweak things to fit our environment, plan a rollout of the final solution, roll out said solution and spot-fix any overlooked issues (there should be, or very close to, 0 issues if testing went well).
1
IT support to devops is a bit unusual and you're unlikely to have the skills required to land a job. I've done a devops boot camp but it was very introductory and the key thing was that I was already transitioning to devops from a similar role.
Not sure I have any particular advice other than be ready to dedicate a shed load of your personal time to bridge the gap. Are you looking to transition to a DevOps engineer with your current employer? That could be a bit different and more viable; however, if you're in the UK the bootcamp will only be partially funded, unless your employer makes up the rest of the cost. Good luck 😊
1
It was a lot later than that… There were a lot of performance and cost reasons why an upgrade was not prioritized, but let’s just say that after this incident it became a very high priority.
1
It will be difficult to say without knowing the exact shape of your data, rate of cardinality growth, and how many un-indexed fields you will require, but my org currently uses on-prem InfluxDB Enterprise, our total cardinality is less than 50 million unique series, and our annual license cost is around $50k/yr; compute and storage are another ~$20k.
1
It would be pretty hard to run up a bill on this project.
1
It would help to know what the script is doing
1
It's a combination of two things. 1) You said the word "Kubernetes" and backed it up with a project at work. 2) DevOps roles are massively hard to source. DevOps with actual k8s experience, be it managed Kubernetes or not, is even harder to find.
1
It's a fun book! If you use it to stimulate thought and discussion, it's useful. If you try to use it like a template to fix orgs, it's not so useful.
1
It's a transitional thing.. it happens when you change some of the fundamentals of how your environment works and part of the job is planning for such events :)
You could either add the script to the branches that don't have it or have a look if Azure Pipelines allows for conditional build steps.. So you can skip the script execution if it isn't there.
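For example, one way to skip a missing script in Azure Pipelines is to probe for the file first and gate the step on a variable (paths and names here are illustrative, and this YAML is a sketch, not tested against a live pipeline):

```yaml
# Sketch: run the build script only when it exists in the checkout.
steps:
  - bash: |
      if [ -f ./scripts/build.sh ]; then
        echo "##vso[task.setvariable variable=hasScript]true"
      fi
    displayName: Check for build script
  - bash: ./scripts/build.sh
    condition: eq(variables.hasScript, 'true')
    displayName: Run build script if present
```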
1
It's because tech can scale very well. Take a salesman or a plumber. No matter how hard he works, he can only serve one client at a time. Even doctors, highly valuable and with demanding skills and knowledge, don't bring in that much because they can't scale.
Take a good mechanic; he has the same problem. He can't scale much himself. Give him a camera and good quality videos, and his knowledge is shared with millions. There are a few garage mechanics who make way more money on YouTube just from the ad revenue than from actually fixing cars, and that comes from the scale of the value they bring.
Doing well in tech? Your efforts providing valuable changes to the business can be scaled to thousands, millions, billions of users. Your duct-taping is worth it, even if it's not harder. Why? Because they pay for value created, not for how hard the job actually is. Being competitive in the tech market is worth it because of that scale.
Just take the most prominent startups or unicorn companies: they didn't invent the needs, they just took existing services and scaled their impact through tech.
1
It's been pretty much all you posted for the past week in several subreddits, how is that not spamming? :D
1
It's definitely possible that there's an alternative way to achieve similar results with chained commands, although I wasn't aware of that and this was still a fun personal project.
I'll definitely take a read into that though, as I'm quite curious now 😃
1
It's either that or a reasonably good chiclet keyboard for fast typers, because they offer a nice feedback if you actually pressed the key or not. I "feel" my typos before I see them, because the keys didn't feel just right :)
1
It's gonna be infrastructure as IPv6 ... any day now
1
It's just part of this myth that k8s is insanely hard to use.
Also, LinkedIn has a very simple and dumb algorithm: if you updated something recently, you come up first in searches that relates to you
1
It's markup; the question is implicitly aching for Turing complete languages. I can't write a for loop in YAML, and so help me I might scream if they try to implement that in the next version.
1
It's much easier to say "let's do X because big company X also does it -- it must be best practice!" rather than to think the problem through to the end and consider all of the practicalities, scaling issues, and maintainability of your approach.
Especially for the admins that aren't actually coding
1
It's not as complicated as you probably think it is to ensure stable and supportable k8s builds get out the door. Even if you're backporting patches. Should be embarrassing for the big 3 to see these small companies like Digital Ocean, Linode, and Vultr ship newer k8s versions faster than them.
1
It's not as much "talent" as you think. If you don't have a competitive personality, it adds yet another layer of difficulty to the job. Also, many people pretend it's easy when they are really struggling. You seem very honest. That is a good thing. I think everyone is nearly useless in their first six months. You're probably not getting a lot of good tickets yet. If they're not giving you tasking that makes you better and helps you grow, then part of this is a leadership problem. I would say just hang in there, but... if you get to critical levels of misery, try to change roles or even companies.
1
It's not bad to know how things work on bare metal. In fact it will only help you when you start doing things through the multiple layers of abstraction of provisioning VMs and then building containers to run on top of them.
My path started with Vagrant, then test-kitchen to build entire systems and even systems of systems from scratch using chef and/or Ansible (and also some verification testing which will be useful for CICD pipelines later).
From there use ansible to build, configure, and deploy docker swarms and containers on top of your infrastructure. The goal is to be able to go from zero to a complete environment in a few commands in a few minutes, so you're never afraid to burn the entire thing down and rebuild it from scratch like everything's disposable. Most of the effort here will go into version pinning, where docker especially shines.
From there use these tools to build your monitoring and CICD pipelines on Jenkins or gitlab runners or whatever. Docker really shines here since a lot of these tools still run on Java and you don't want to get stuck maintaining JVMs outside of a container. Eventually you have automation set up so touchless deploys can happen through dev/test/staging/prod without your intervention at all... even rollbacks when automated tests fail.
Have fun holding it down!
1
It's not hard but different than the LAMP stack it "replaced". So it takes time. Anecdotally, it took me a year of work to really grok it (vanilla K8s, CKA and so on).
1
It's not rocket science to us - it's mundane.
And yet massive, multi-billion dollar tech companies have spent about a decade now failing to achieve exactly that task.
I run a company selling a simple unified DevOps service, and even major tech companies see containerized apps and serverless architecture as something akin to magic.
1
It's not that simple. Look at the logic behind any mass market timesheet app and you will want to die. Here's a short nonexhaustive example list of things that will make you want to eventually cry:
* Time off requests
* Time off balance accruals (with maximums)
* Approvals for time off
* Substitute approvals
* Deep integration into HR systems
* Permissions on timesheet visibility
* Complex rules on nonexempt staff pay and PTO accrual which depend on the jurisdiction the employee works in (hope you have a lawyer barred in every jurisdiction you intend to support with this app)
* Tracking profitability (integration into a project management system that tracks project revenues and nonpersonnel cost)
* Tracking holidays, both official and company-given
* Tracking "floating holidays", which give an extra day automatically when not taken
* Expenses
* Business travel
* Email integration
* Auditability of any and all actions
* ...
There's a reason that each and every timesheet app is a huge pile of shit on the technology side: the tech isn't a profit driver and it doesn't matter. The real value is in the business logic behind the scenes, nobody really cares if the app looks like ass on a mobile device or takes 10 seconds to load in between clicks.
If this is just a project to show "hey neat I know some webdev and DBs" then yeah sure throw it together. But don't be fooled that it's a replacement for any market timesheet solution.
1
It's possible they mistyped the IP. It's also possible they had that IP before you got it.
1
It's possible, but you have to have the social skills and good will of management to get that done. Many engineers, unfortunately, don't have the social skills. I include myself in that sometimes. I'm still learning how to "influence without authority" in many settings.
As far as good will goes, if you're new you generally have free rein for a while to change shit up without much resistance. And if you've been there a long time with a lot of success on your record, you can as well. Anyone else trying to make these kinds of changes will meet a lot of resistance though.
Just because this is easy for you, don't assume it is for most people.
1
It's possible, obviously, but you are going to be working late nights for six months to a year trying to learn what you need to complete your daily work.
1
It's real easy to say make the developers answer the 3 AM call. When your best developers get fed up and leave then the policy typically changes. It is hard enough for companies to find good developers but finding good developers that can handle the infrastructure part and are willing to be on-call? You better start paying FAANG salaries for that.
1
It's really going to depend on the job. But learning by doing and being self-driven is what makes you excel technically in my experience.
You could set up CI pipelines for your projects.
1
It's so extremely vast with so many moving parts that the hardest part is trying to remember how to make it do the specific things in YOUR use case
1
It's still common but it's very old-fashioned to have the changelog in the repo.
A) Your Git commit history is your changelog (if you do it right)
B) Platforms like GitHub allow to create releases with artifacts and release notes (manually or automatically via the pipeline, often extracted from the Git commit history). And they purposefully have this as a separate feature that does not change the repo. Same reason as for build artifacts: The same information is already in the repo. You can re-create the changelog whenever you need. But if you put it in the repo, you have the same information twice and sooner or later those two sources will deviate.
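Point A can be sketched in a throwaway repo - the changelog is re-created on demand from commit history rather than versioned alongside it (commit messages below are made up):

```shell
# Throwaway demo: derive a changelog from commit history instead of storing it.
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q
# Fabricated commits purely for illustration
git -c user.email=demo@example.com -c user.name=demo \
    commit -q --allow-empty -m "feat: add login endpoint"
git -c user.email=demo@example.com -c user.name=demo \
    commit -q --allow-empty -m "fix: handle empty tokens"
# Re-create the changelog whenever it is needed (newest first)
git log --pretty='- %s' > CHANGELOG.md
cat CHANGELOG.md
```

Conventional commit prefixes (`feat:`, `fix:`) make it easy for tooling to group entries when generating release notes this way.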
1
It's still tough to tell what you're actually asking. The Docker Engine API docs are [available online](https://docs.docker.com/engine/api/v1.41/).
Are you asking how to make a web app that interfaces with a HTTP API? If you're looking for libraries that wrap up the docker engine API you'll need to state your preferred language at least.
It would be really useful if you could state a lot more detail of your question.
* What are you trying to achieve?
* What do you know about it already?
* What are you missing and what specifically are you asking for suggestions on?
1
It's true. However, I've never met someone who can do both of these things well. That doesn't mean you can't, but the vast majority of real world situations are companies that have dedicated software development teams and a dedicated DevOps team. In this scenario, the argument you make doesn't add value to either of those teams by themselves.
1
It’s a decent read and a useful intro. The DevOps Handbook is a roadmap for the practical application of this approach, and I have a preference for it over The Phoenix Project. That said, it is quite a bit more tedious and technical, so I regularly recommend both and suggest people choose according to their tastes.
As far as orgs go, I can’t say that I’ve worked with one that implements the strategies across the board. But, I have worked to push the application of some of the concepts which has led to positive impacts for the companies and my career.
Like u/Zauxst mentioned, you need top-down buy-in for some of it to be successful, but if you have leadership that at least listens to ideas, the handbook might help you put together a specific strategy that you can bring to leadership.
1
It’s a robust platform with lots of interlocking parts (that you can also disregard if you prefer).
Have a solution for private Nuget and NPM packages? Cool, use that. Want to use Azure Artifacts? Also cool, knock yourself out.
I prefer the look and feel of GitHub for managing repositories, but ADO handles it fine, and there are LOTS of extensions to improve the experience.
Self host build agents, host them in a cloud provider, or let Microsoft handle a serverless option.
We end up using the API integration a lot to make the build and deploy pipelines integrate a bit more (and automate some silly stuff) and the documentation is typical Microsoft - verbose, voluminous, but very effective.
Anyway, they’re not paying me, so let me also point out it’s probably the ‘least cool’ option….but point being I think it gets underestimated because it covers so much ground and gives so much optionality.
1
It’s lame too because LinkedIn has a hard requirement that you keep replying to them or they mark you as not actually looking
1
It’s not even really just v6.
We ended up here cos we ran out of v4 addresses and ended up building NATs.
Now people wrongly assume we have private space / NAT as a security thing. Which is not really the case. There are a few “fat finger” config mistakes it might reduce the attack surface from, but there are “fat finger” mistakes you can make with NATs which expose you also.
1
It’s possible to run a Flask app directly out of lambda. I’ve got one running behind an ALB that basically renders a pretty format for a DynamoDB table. I’d have to go find the library I used but there’s a Python library that lets you run a WSGI based app in lambda.
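I'm not sure which library the parent used, but those adapters essentially translate the Lambda event into a WSGI environ and call the app (Flask apps are WSGI callables). A stripped-down, stdlib-only illustration of the idea, with all names hypothetical:

```python
import io

def simple_app(environ, start_response):
    # Tiny WSGI app standing in for a Flask app (Flask apps are WSGI callables too).
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello from wsgi"]

def invoke_wsgi_from_event(app, event):
    # Roughly what Lambda-to-WSGI adapter libraries do: build a WSGI environ
    # from the Lambda event, call the app, and repackage the response.
    environ = {
        "REQUEST_METHOD": event.get("httpMethod", "GET"),
        "PATH_INFO": event.get("path", "/"),
        "SERVER_NAME": "lambda",
        "SERVER_PORT": "80",
        "wsgi.input": io.BytesIO(b""),
        "wsgi.url_scheme": "https",
    }
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(app(environ, start_response))
    return {"statusCode": int(captured["status"].split()[0]), "body": body.decode()}

result = invoke_wsgi_from_event(simple_app, {"httpMethod": "GET", "path": "/"})
print(result)  # → {'statusCode': 200, 'body': 'hello from wsgi'}
```

Real adapter libraries handle headers, query strings, binary bodies, and base64 encoding on top of this, but the core translation is the same.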
2
It's a greater-than.
1
It's a security measure.
1
Its declarative approach pushes you to write direct flow controls with all those conditions, loops, etc. It adds a vast field for mistakes to the task of idempotent infrastructure management. Just imagine Kubernetes manifests written in some procedural / OOP language - ugh!
And don't get me wrong - I'm not against programming. Conversely, I'm always glad to apply my skills to develop some handy tools or backend apps in C# or Python. I just prefer to rely upon a DSL when it's appropriate. And it's nonsense that programmers couldn't learn new languages, be they general purpose or domain specific. We [create](https://www.manning.com/books/domain-specific-languages-made-easy) new ones when it makes our (and users') lives easier.
1
It's done. Move on. Time to call it. Start an OnlyFans.
1
Its lack of existence
1
It's much easier than that and won't take months. Most of the time, solving a very simple problem is more than enough.
What tool you use, how well you navigate while coding, how you structure the code with naming, how long it takes, how many mistakes you make, etc. are much more revealing than solving a super complicated algorithm.
1
It's very good, and they have done a lot of research and written other books around DevOps. All of their books and The Phoenix Project are based on the DevOps research they have conducted. I'd highly recommend "Accelerate" and "The DevOps Handbook".
1
I've done an AWS migration from 2.3 to 7.2; we were able to decrease monthly cost from 30k USD to 5k.
Not even a thank you from management, ehh. For them, 25k was not significant enough to put 2 people on the migration for 2 months - we did it either way.
1
Jenkins <==> Jank-ins.
1
Jenkins at my place
1
Jenkins is a known tool, and I too dislike it.
That said, since it is a bit everywhere it has lots of plugins which can come handy at times.
It is not too hard to maintain, but you may be prevented from upgrading when you use a plugin that is not maintained.
I used GoCD, which like jenkins is easy to spin up.
It looks a bit better than jenkins in my opinion.
Then, if your workload is kubernetes based, argoCD (or flux) is where the hype is.
Very clean workflow and other projects in the argo world help plug some of the CI aspects.
In terms of git stuff, look up gitea for no nonsense, I rather host that than gitlab, which got very complex in recent years.
If you want no maintenance, GitHub and Circle CI seem to be where people are at, but I'm not a big fan of GitHub, especially since they will start to push Azure heavily.
1
Jenkins is for sure the best because it's the only CI/CD tool that lets you use a full programming language to build your pipelines. All the others are just bash and YAML.
If your builds are simple like docker and kubernetes then those other tools will work fine. But if you have more complicated workflows then Jenkins is where it's at.
Jenkins is hard to learn because the documentation isn't the best and it's confusing because the pipeline documentation overlaps with groovy documentation but not all of it applies.
People complain about Jenkins plugins but it takes a pretty large Jenkins to have enough users and enough plugins that you run into trouble updating Jenkins. I've never personally had problems.
But I also built my a lot of functionality into my shared library https://github.com/DontShaveTheYak/jenkins-std-lib so I don't have to use plugins. My main use case is that I want to be able to run my pipelines on any Jenkins. Which means I can't count on a plugin being installed.
If you're going to go with Jenkins, there is a lot of cool stuff in that repo. Like running Jenkins locally in a dev container.
1
Jesus Christ you guys learned a right way? *There* ***is*** *a right way?!*
1
Jira is the de facto standard because it lets business people grab the reins and control agile the way the business wants. This is almost always not what developers want.
1
Just be careful with the costs, replication gets expensive very quickly. Regarding IaC, check out [Terraforming](https://github.com/dtan4/terraforming), it allows you to generate state files from existing AWS deployments, in that way you could do everything with the UI first.
1
just because you learn a tool doesn't mean you can get a job. It's more so gluing shit together and having a high-level understanding of a bunch of shit. You don't need to know stuff in depth but know how to google and learn it quick.
1
just buy another laptop ... also docker
1
Just curious, can it be used efficiently to store any type of time series, not just for metrics monitoring?
1
Just curious, is an RHSCA certificate worth pursuing?
1
Just did a bit of quick reading on this. It looks like they wouldn’t be subject since they don’t currently have a presence in Colorado. Apparently if you’re employee number one in the state you get screwed cause you’ve no one to have equal pay with?
1
Just do the below two things, till you have things under control
* Don't start too many things at the same time.
* Begin by automating the daily repetitive tasks and keep improving them till they work without any issues.
**Disclaimer:-** This is applicable to your case only (based on your post above) and my recommendation may differ for another person depending on his requirements.
1
Just don't, without prior experience. It will eat you alive and you won't be of any help for the business. Took me 10+ years to collect the needed experience to do my daily tasks. And I still live for working, not the other way around.
1
Just got promoted to mid but I lack confidence and experience building pipelines from zero, got into big clients since day 0. Do you have any recommended challenges I can do by myself (not only related to pipelines)? Also any advice is well received.
1
Just make context switching notes where you have to teach a junior or mid everything you know in a few paragraphs and diagrams for each language you want to maintain
1
Just make sure to follow 3-2-1 backups for your data. 1 copy on AWS, 1 on off site storage and 1 in the hands of the Chinese government.
1
Just migrated away from it
- goes down frequently
- poor support for monorepos
- couldn't run arm64 runners
- tickets/ bugs stayed open for years and closed source so you can't fix them yourself
You get what you pay for..
1
Just my personal opinion, but for me "DevOps" is about solving business issues with technical solutions. Having a reliable CI/CD pipeline to test code and deploy very often is pretty much only there to satisfy the business requirements of pushing new features and ensure uptime, because less uptime equals less money.
To "learn DevOps the right way", you have to work for a couple of years and witness such issues. You have to be in a zoom call or conference room with developers, sysadmins and managers that are arguing and trying to blame each other for the last incident.
That's why, I think, a lot of people in this sub say that "DevOps" is not a junior role, that you can't be a DevOps Specialist right out of school.
You're on the right path seeking technical concepts. You should also try to gain experience and surround yourself with the right people that will tell you why things are a certain way.
1
Just own up and apologise. We all f**k up now and again.
1
Just publish the container image of your account's ECR and allow their account to pull from it?
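For reference, a sketch of an ECR repository policy that grants another account pull access (the account ID here is a placeholder):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowCrossAccountPull",
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::111122223333:root" },
      "Action": [
        "ecr:GetDownloadUrlForLayer",
        "ecr:BatchGetImage",
        "ecr:BatchCheckLayerAvailability"
      ]
    }
  ]
}
```

Attach it with `aws ecr set-repository-policy` on the repository in the source account; the other account can then pull without the image leaving ECR.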
1
Just pull a record out and play it to configure your server!
1
just share it and let us be confused
1
Just so I've got my head round this, would Cloudflare work in a similar way to a Heroku addon such as [Expedited CDN](https://devcenter.heroku.com/articles/expeditedcdn), where it would sit in front of the app (front end and API in my scenario) and essentially manage traffic?
1
Just start doing a bit of training at night and on weekends. That feeling goes away faster than you think.
1
Just to add on, although not in an ELI5 way, the Switch in the house is known as a "walled garden". It's an easy to implement security control that at least adds an edge that people must cross before accessing what is in the walled garden. In this case the edge is a NAT, a WAF, some kind of proxy, etc...
One downside to a walled garden is that things inside have a tendency to trust anything else in the walled garden. Think of it like your home network. Most people let all of the devices on the network talk to each other because once you're on the network it's assumed that you're trusted.
There are more advanced security models that don't rely on a walled garden and instead require every person and service to be able to authenticate itself and establish trust with something like a signed certificate. But this is sometimes harder to do than a walled garden, especially if you're deploying third-party tools that may not support certificate-based authentication (or something similar).
tl;dr keeping private systems private, and then separately exposing only enough access to public services, is what is generally meant by using public and private subnets. Once you've learned how to implement this model, there are other security models if you care to expand your skillset.
1
Just to make sure that I understand it.
Even if I have a server with only 80/443 open in a public subnet, it could be used to try and access other services in the network?
In the case of using load balancers, the load balancers themselves would be on the public network and the services on the private, so if the load balancer gets invaded the services in the private subnet would be safe?
1
Just to piggyback off this - has anyone had any experience doing an IaC approach to AD DNS?
1
Just use [AWS Amplify](https://docs.aws.amazon.com/amplify/latest/userguide/welcome.html) It's beautiful for your use case. You can add lambdas, hosting, authentication + 2FA, custom domain, certificate validation and such.
If you want more control over the backend, you can use [AWS SAM](https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/what-is-sam.html) or [AWS CDK](https://docs.aws.amazon.com/cdk/v2/guide/home.html) to provision additional S3 buckets for stuff like profile images and other assets... And at most a PostgreSQL RDS instead of DynamoDB if you like.
I usually use AWS SAM + Amplify in my projects which have mobile apps + web + Lambdas + S3 + RDS. You may feel the learning path is a bit steep, but Cloudformation is just nice and can help in abstracting devops away from your code :)
1
Just wanted to add that this has worked great thanks, Helm GitLab runner connected up easily via the auth token, just got a bit of faffing and tweaking to do but it's executing my pipeline stages with multiple images, lifesaver!
1
Just Windows but I know of at least one team that has a bootstrapping script that's intended to be cross platform though I haven't touched it personally.
1
K8s is nice but it's a lot to do in one go.
It sounds like you're up to your neck in break/fix work so it would be better to do some of the k8s prerequisites and see immediate benefits first.
I'd focus first on CI - you want reproducible server builds so you can see all the server config (self-documenting!!) and terminate and recreate misbehaving servers at a push.
I'd recommend that the CI pipeline create docker containers and push them to your repo.
Next you can do simple CD - servers that spin up, run 'docker compose' - which pulls down your docker containers and runs them. You can create systemd service files to do this for you.
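A sketch of such a systemd unit, assuming the compose file lives in /opt/myapp (the service name and paths are placeholders):

```ini
# /etc/systemd/system/myapp.service
[Unit]
Description=myapp via docker compose
Requires=docker.service
After=docker.service

[Service]
Type=oneshot
RemainAfterExit=yes
WorkingDirectory=/opt/myapp
# Pull the latest images, then bring the stack up detached
ExecStart=/usr/bin/docker compose pull --quiet
ExecStart=/usr/bin/docker compose up -d
ExecStop=/usr/bin/docker compose down

[Install]
WantedBy=multi-user.target
```

Enable it with `systemctl enable --now myapp`; `Type=oneshot` with `RemainAfterExit=yes` lets systemd treat the detached compose stack as "active".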
Next I'd move to IaC - it won't confer as much impact immediately but over the long term it will reduce the amount of time you spend debugging misconfigs in your infra and should be essentially self-documenting.
When you've done all of that, you'll be well-placed to migrate into kubernetes, and should have the headroom to do so.
1
K8s it’s great getting my CKA made my marketability sky rocket. There’s a lot more to successfully running a k8s environment in production than the cert suggests but it’s great.
I also find k8s enjoyable even if it is a lot sometimes
1
Keep on using hammers to every problem you encounter. Thanks to you ppl like me have a premium job.
Terraform is an orchestration in itself.
1
Kinda curious, why use Nomad or ECS instead of k8s? Everyone has a managed k8s service, and there are a lot of tools developed with k8s in mind (like operators).
1
Know this very well. Back in 2012 or so I took down the whole core of the US division of a European multinational by shutting down the wrong tunnel by mistake (one connecting two core sites instead of a customer's). It took a couple of hours to notice what was causing the lack of connectivity, so in the end it helped us improve monitoring and event reaction...
1
Kube is extremely beneficial if your codebase/developers understand how microservices work. However, it adds more complexity to the environment, especially as the system grows.
Monoliths work well but are slow and not easily modified.
Kube allows for quicker modular replacing of components but only if developed properly.
I'd say yes 100% because the concepts you will learn are just generally good concepts and you'll be forced to understand namespaces, network segmentation and a different type of architecture
1
Kubernetes controller that enables timed resource deletion using TTL annotation: https://github.com/TwiN/k8s-ttl-controller
1
Kubernetes is not as hard as most people make it out to be.
To be fair, the pain of K8s is managing the cluster, particularly the control plane. But that is becoming less of a thing now that you have GKE/EKS/AKS as hosted versions that you only pay ~$75 a month for. It takes the headache out of K8s and allows you to just reap the benefits.
There’s still stuff to know for sure and there’s tons of optimizations. But it’s not any different or harder in my opinion than all the other things we do in DevOps.
1
Kubernetes is overkill for almost every scenario I’ve ever seen it used for. That said it pays well.
Sounds to me like you have either physical machines or vms of some kind deployed to a cloud provider. In which case I would say you might want to look at ansible. Fairly easy to build an inventory and then you can run shell tasks by group.
1
Kubernetes isn’t always the right tool for everything. Especially if you’re new to all the tools, Kubernetes is a bummer to start with.
As it sounds, you can't just move your application to Kubernetes and have it work out of the box. You might need to adapt your application slowly to a microservices architecture. Your cron for example could be a separate service, so it doesn't fire three times when scaling the application.
I would start with Ansible first, though. Use it to build a "restore from backup" playbook that lets you sleep better at night. If you're confident on the command line, you can just build a playbook that fires your well-known commands. This way, you're getting familiar with it and build confidence to refactor it using proper Ansible modules.
If your final goal is to go with Kubernetes, you probably won't need Ansible that much in the future. But it won't hurt knowing things around it.
1
Kubernetes works just fine for a monolith with good horizontal scaling.
You're offloading the scaling/health/uptime/deployment component to Kubernetes and away from your own orchestration (i.e. packer + AWS ASG + whatever you use to deploy).
If you're running containers, it's a no-brainer compared to something like ECS.
If you're not using containers, it's only worth dabbling in it if the end goal is containers.
1
Late to the party, but this is the big thing for me. Terraform CDK uses the FUCKING MASSIVE provider library that pulumi will take years to catch up to, if ever.
Pulumi is great from the app developers perspective, but if you have an infrastructure footprint larger than an app on an ec2 instance, use Terraform proper/Terraform CDK hybrid.
1
Learn a programming language like python and skill up on bash. Hopefully you're getting exposure to the inner-workings of the cloud offering as a support engineer. Dig into problems and really try to understand the root cause and fix for any solutions. Just getting the hands-on experience will help you learn the stack.
Once you're comfortable with a scripting language + bash and have a familiarity with your company's technology, focus on CI/CD pipelines. Find out which technologies your company is using for their pipelines/testing/deployments and learn them in a homelab.
1
Learning K8s and getting certified is my next to-do. I’ve heard nothing but great things from K8s practitioners.
1
leave it, you guys have completely gone the other way!
1
Leaving the "lessons" of the Phoenix Project aside, I think to have some perspective on the narrative part it helps to have read the book it's modeled after, The Goal by Eliyahu Goldratt.
The story of The Phoenix Project is a rewrite of a guy turning around *a factory* in the 80's (and even then it was written like it was from the 50's or 60's) using the proposed 'theory of constraints'. Yes, both are simplistic stories with simplistic characters (caricatures, even). They're supposed to be.
The usefulness of the 'theory of constraints' can be debated, but what I'd like to think it does is get a company that's averse to change to really take a look at what's holding them back, why, and how those things need to change.
1
Let us take this offline. I can guide you and help you manage this the easy way.
Most of the suggestions are good, but they don't fit your needs. (At least what I understood so far)
Because the end user is not computer savvy, it is important that the solution you choose is clear, simple and delivers what needs to be done.
The simplicity will enable you to pre-empt and resolve potential issues by adding validations.
It should also enable you to quickly understand the problem in case you missed any scenario.
What I mean to say is that you will also be responsible for providing support if your user runs into unexpected issues, and hence your solution has to be simple and easy to understand, code and deploy.
1
Let's wait for someone running in here explaining that they force limits via either namespace limits or OPA because they don't trust their developers, or that it's required by the business because it's the only way they can force isolation in a multi-tenant architecture.
Yada Yada.
1
Like you said - it may require quite a few tweaks before you have a working sample.
It's the same for all the other tools available on the market. Fortunately Cloud Engineers have not been automated out of the job yet.
1
Link away!
1
Linting and formatting should happen on the dev's machine (automated!). And a git hook should prevent anything being committed that is not correctly formatted.
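A minimal sketch of such a hook, assuming Prettier as the formatter (the tool and the file extensions are placeholders; swap in whatever your project uses):

```shell
#!/bin/sh
# .git/hooks/pre-commit (must be executable) — blocks commits containing
# unformatted files. Prettier and the .js/.ts glob are assumptions.
staged=$(git diff --cached --name-only --diff-filter=ACM | grep -E '\.(js|ts)$')
[ -z "$staged" ] && exit 0

if ! npx prettier --check $staged; then
  echo "Commit blocked: run 'npx prettier --write' on the files above." >&2
  exit 1
fi
```

Pairing this with an editor format-on-save setting means the hook almost never fires; it's just the backstop.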
1
Literally facing this right now, go from 90 to 120k and an interesting job but go from remote 5 days a week to 40 min drive to the office once or twice a week. Tough choices
1
Literally never would support outsourcing. You can't pay me to go against my own principles.
1
Literally spin up minikube and deploy some services. Couldn't be simpler.
1
Living on the edge.
1
Lmao as if most places care what FAANG are doing
1
Log in to the private repo, pull an image, retag it (with `docker tag`), then log in to the GitLab registry and push it.
If space isn't a concern then pull all the images at once.
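A minimal sketch of that sequence (the registry hosts, project path, and tag are all placeholders):

```shell
# Mirror an image from a private registry into GitLab's container registry.
SRC=registry.private.example.com/team/app:1.2.3
DST=registry.gitlab.com/mygroup/myproject/app:1.2.3

docker login registry.private.example.com   # credentials for the source
docker pull "$SRC"
docker tag "$SRC" "$DST"                    # same image, new name
docker login registry.gitlab.com            # credentials for the destination
docker push "$DST"
```

Loop over a list of image names if you need to mirror several at once.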
1
Loki seems like exactly what I need! Thank you!
1
LOL deal with it.
1
Lol gotta love the condescension of pedantic redditors
1
lol technically you are right, they solve the same problem. But then we are not answering u/OP's question. My intention was to ensure we are also answering OP's question :-)
1
lol thanks for answering my silly question
1
Lol this one reminds me of someone we fired because he was awful and making a lot of mistakes. A few weeks later we had some accounts closed and emails about suspicious activity. Turns out he had cloned a repo and its whole state file, which contained all the upstream sensitive info from core plans via data sources, and put it on a public repo, and even kept our company name in the title... never have I seen such stupidity.
We reached out both to GitHub and him, and he got a 1-day notice from GitHub to remove it. He then told us it was removed, but all he had done was rename the file... such a clown lol. He claimed it was an accident but I don't see how someone could be so bad.
1
Lol, this. Spaghetti documentation to go along with spaghetti code. If it's going to take me half a day to read the documentation I'll just reverse engineer it
1
LOL!
I can show you completely unreadable TF code in no time. And it's in production and the "blessed" way to do things. Completely unreadable, in my opinion.
1
lol.. okay. I will stick to my nonsense :-)
As they say, never let go of a milking cow.
Good luck with your sensible answer. Please share the results :D
1
Look at what AWS offers, older versions, major/minor/patch - that is just how it is. If you want the latest version, you are going to have to roll it on a platform yourself. [https://docs.aws.amazon.com/eks/latest/userguide/kubernetes-versions.html](https://docs.aws.amazon.com/eks/latest/userguide/kubernetes-versions.html)
1
Look for a large IT services provider, a company that has everything from desktop support to full IT-as-a-Service. Start with desktop support and an eagerness, showing off skills by automating parts of your job (or parts that haven’t been automated but you come in and save the day!).
Another option is QA. One of the most talented DevOps guys I’ve worked with has a Masters in Social Work/Counseling, something like that. He came in via QA and just started learning, automating, contributing without being asked but in a positive way. Eventually he moved up the chain into a full developer role but on the DevOps side.
1
Look through Azure’s documentation. They give a lot of good guidance on building out their clusters and they’re fairly easy.
1
looks awesome, thank you!
1
Looks like a cool tool but there is no license listed. I’d be hesitant to use it at work without knowing the license.
1
Lot of startups do this, call it clickity click or clickops
1
Lots of good suggestions and advice here. Two important points that I'd add:
**Don't neglect your current SLA.** As the saying goes, you're rebuilding the plane mid-air, so you'll want to make sure the changes you're making to your tooling won't interfere with your ability to ensure uptime/availability of production services. IMO, this is an argument for more incremental changes that fit in comfortably with your current manual workflows (e.g. introducing some automation like Ansible and monitoring/alerting).
**Don't agonize over technology choices**. Everyone has their pet logging platform/automation tool/whatever, so don't worry too much about which one is the very best. Trust your intuition here about what seems to be the best fit for your use cases, and I'd wager that you'll be right more often than not!
Good luck with your journey!
1
Lots of Windows/.NET shops use TFS/ADO exclusively. It isn't going anywhere.
1
Louder for the people in the back: DO NOT SAVE SECRETS INSIDE IMAGES.
1
Love it.
1
M3db is also an option alongside clickhouse
1
Maddening - it would still be easier to package it up into a cheeky FaaS app that dumps this reporting out for the bookkeeping people - or better yet, creates some connector between the bookkeeping software and AWS that automatically reconciles the AWS state and the bookkeeping database.
1
Made a monitoring shell script using a for loop. Saturated all the production LPARs in terms of CPU.
1
Make a base container and build the script into that; then all your apps can build upon the base container. We have used that workflow for keeping standard scripts and configurations consistent across applications. We rebuild base containers for all languages every night, and deployment pipelines for each application build on the newest available base containers for each deployment.
1
Make it two. If your budget allows, maybe a 3D printer.
Important TIP: whatever you do, do not let him start with home automation or building his own mechanical keyboards. There will be long nights and you will stay poor forever.
1
Make sure you are actually running the OSS version. For a long time (still?) Drone was publishing the enterprise version in DockerHub.
1
Man, do I see myself in you. I’m 40+ now and have operated that same way.
Take a look at this project for a really cool example of how Ansible and Vagrant can work together, I really think you will enjoy it.
https://github.com/ChadDa3mon/infra-ansible
1
Man, I am looking forward to (hopefully) passing the CKA later this month so I can casually drop that into my resume.
1
Managed to green light a release that shut down device registration for the second largest TV streaming platform in the US, during peak viewing hours on the night the CEO was participating on a startup panel with Mark Cuban. The outage was all over Twitter and the queue took 2+ hours to drain.
Moving a VM with 10+ years of critical financial data and somehow failed deleting the VM on both machines. Thank god for Backblaze. Saved my ass with little data loss.
Plugged in a daughter board incorrectly and burned up a professional studio grade sound card as well as the computer. Ram and processor were saved thanks to protection that Intel had on the bus.
1
ManageEngine Password Manager (formerly Zoho) is competitive with Thycotic Secret Server and the licensing is better IMHO. I like the UI and experience a bit better too.
1
Management ordered everybody to read the book and then installed a rented arcade game in the break room for a week.
1
Many people will tell you it's a security measure. That is wrong. You can port forward to private addresses and be just as insecure (or, more commonly, get something downloaded to the machine from a bad update, etc). I think some people assume that having a public IP means you must have just some old router and no firewall. :) Just because you have a public address does NOT mean that you have to allow people to access it. In fact, it makes it very hard to spoof you, since your IPs are routable. It also makes things like site-to-site VPNs much easier, when there are no conflicting private address spaces.
Also, note that IPv6 does not have a concept of 'private' IPs, only link-local.
1
Many, many companies have "DevOps" which is primarily focused on "ops" - I'd find one of these companies.
Some people just aren't devs - they're just as intelligent as devs, and can even do technical work, but they just don't have a capacity for that kind of abstract reasoning.
I am very good at spotting how such people think and speak, and you sound like you are one of these people (and it's clear you know that). When people just have that gut feeling that says "I'm never going to understand development" they're usually right.
It's actually more common for "DevOps" teams to *not* write any application code - you'd be able to find the kind of work that suited you in many places.
1
Marking this for later
1
mastermnd has awesome bootcamp playlists on YouTube, try checking him out
1
Match your prep plan to the job description, if they use IaC learn enough to answer what the benefits are and how it works. If they use containers try to do the same thing. If they want someone with experience you likely won’t get it, if they’re open to training up someone then it’s possible they like you as a candidate and you don’t have to feel pressured to know everything going in. It really all depends on what they’re open to. Good luck.
1
May I ask what the advantage is of running migrations from the app as opposed to running the scripts separately? I fail to see how lumping the migration and the app deployment together is in any way better. If anything, devs will think even less about the implications of a migration succeeding while the deployment fails.
1
May someone already suggested this. InfluxDB is a good choice too.
1
Maybe, but not quite to the extent you are.
I've developed apps in 15 languages or paradigms over 35 years and still find myself trying to remember some basics.
The more languages I've learned and the more the paradigms change the worse it gets.
The good thing is the correct answer is only a google away. I find that I never forget the basic concepts that make finding the correct answer easy.
Don't get stressed out, apply yourself to re-learning the current tech.
1
Maybe not what you really want or need but Elasticsearch should fit in that budget for a cloud subscription plan. Not sensible to ignore one of the major players in a market without knowing a reason for not choosing them
1
Maybe remove the resource limits for CPUs (See [Stop using CPU Limits](https://home.robusta.dev/blog/stop-using-cpu-limits/)).
If that does not change anything, you might want to look into the application itself, because even with only 10 VUs, the performance is quite bad (see line `http_req_duration` in your first test - a median response time of ~0.5 seconds).
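For reference, dropping only the CPU limit would look something like this in the pod spec (the values are placeholders):

```yaml
resources:
  requests:
    cpu: "500m"        # still drives scheduling decisions
    memory: "256Mi"
  limits:
    memory: "256Mi"    # keep memory limited; omitting cpu lets the pod burst
```

The request keeps scheduling and fair-share behaviour intact; only the hard CPU throttle goes away.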
1
Me neither at the time, which made debugging the sudden failures of random programs...challenging.
1
Mechanical keyboard
1
Mechanical keyboards are so weird. There is no real reason to like them; they're like the Nick Cannon of keyboards.
1
Meh, I’m a manual/RWD/6cyl elitist I guess
1
Message me. I can see who my company uses in your country. Can’t guarantee how great the contracting company is to work for but I like their employees
1
Microservices were the bane of my existence. I was asked as junior dev to implement a lambda for a micro-batch job we had, since no one had ever tried it. I did it so well that I implemented 4 more. Then they were promoted from dev, to staging, to prod. So now I had 15 Microservices instances to maintain. And we were using multiple regions, so then that multiplied by 3. Then some clients were pushed to another version of our product, but not all at the same time, so I had to deploy another 45 lambdas....
1
Might as well be called KubeOps these days, imo.
1
Might be a good idea to ask your new company if they have an A Cloud Guru or similar subscription (or buy one yourself) and get cracking on some kube fundamentals.
They obviously hired you for a reason, don’t stress and ask for help.
You might find another engineer (software/dev/doesn’t matter) can spend 30 mins running through the platform at a high level and you can go off and learn what you need to. Maybe shout them lunch in return 😎
1
Might not matter in this case but timescale isn't completely open source
https://www.timescale.com/legal/licenses
1
Migrated a Kafka cluster from one broker to three without appropriate replication configuration and topics (data) in new cluster locked as they were corrupt. The whole platform didn't process anything for the better part of 12 hours... I can't know for sure how much that was in money but it was in the millions...
1
Migrations in a transaction will fail, but this is a super bad idea. Run your migrations prior to deployments of the new app version. In the CD pipeline. I don't know why multiple instances of Flyway have to be running at the same time; this makes little sense.
If you mean that the app instances themselves are using flyway to migrate .. move away from this immediately. This is the absolute wrong way to handle this.
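A rough sketch of what that CD-pipeline step could look like in GitLab CI, assuming the official Flyway image (the stage names, variables, and deploy script are placeholders):

```yaml
stages: [migrate, deploy]

migrate:
  stage: migrate
  image: flyway/flyway:9
  script:
    # Runs exactly once per pipeline, before any app instance starts
    - flyway -url="$DB_URL" -user="$DB_USER" -password="$DB_PASSWORD" migrate

deploy:
  stage: deploy
  script:
    - ./deploy.sh
  needs: ["migrate"]   # deploy only after the schema is migrated
```

With this shape there is never more than one Flyway run racing against another, and a failed migration stops the rollout before the new app version ships.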
1
Mimir if you want new and shiny and Victoria metrics if you want long term Prometheus style storage done well
1
Monitoring has entered the chat.
But yes in terms of basically being able to execute some code in response to requests and store some state that's probably about it.
1
More accurately, leetcode exists because of and is modeled after interview processes like Google's.
1
More fucking blog spam. Check OP's post history, probably a bot, or has a financial interest in the linked blog. When is this sub going to get better instead of worse?
1
More like he realized he's talking with a contrarian and puts a stop to wasting their own time.
1
Most frameworks I've ever worked with when I was still back end included generators or fixture models for the PDO.
So that's what I always used. But if you don't need anything fancy, just popping a couple `.sql`'s into the testing dir should do fine.
1
Most modern text editors/IDEs will let you edit remote files, so you shouldn't need to package a desktop environment/gui with anything.
I currently use a Docker image that's based on the full-fat ubuntu LTS image so it has all the common cli utils. I also install and run ssh server in the container so my text editor can reach inside to edit files in the container.
1
Most organizations that think like this and blindly ape FAANG orgs fail to understand the basic truth of those companies; every single one of them has a literal army of top-flight, best-in-the-world engineering talent to throw at any problem that should arise. They can afford to throw dozens of engineers for an indefinite period of time at any architectural complexities that arise from their tooling; most startups cannot.
What is architecturally "scalable" for a FAANG company may not necessarily be "scalable" for a startup with 15 employees (Maybe half of which are even technical minds). Startups should always plan for scaling but they have to work around the realities of their limitations.
1
Most people already do IaD I believe and still call it IaC.
In my understanding IaC is something like Ansible where you describe actions to do.
IaD is a declarative file like a Kubernetes yaml manifest and you don't need to describe what to do with it (GitOps approach).
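To make the contrast concrete, a loose side-by-side sketch (the names are placeholders, and the line is admittedly fuzzy since Ansible modules also describe desired end state):

```yaml
# "IaC" in this framing (Ansible): a task, i.e. an action to perform
- name: Ensure nginx is installed
  ansible.builtin.apt:
    name: nginx
    state: present

# "IaD" (Kubernetes manifest): pure desired state, reconciled by the cluster
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: example-config
data:
  key: value
```

In the GitOps approach the manifest just lives in git and a controller makes reality match it; nothing in the file says *how*.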
1
Mostly I use Terraform. But I have some Kubernetes workloads with dynamic services/ingress management, for this I use ExternalDNS.
1
Mostly PowerShell, Bash, and Python.
1
Mumshad's udemy course is great if u want to learn Kubernetes in depth. I passed CKA last year by taking his course. Though needed to practice a LOT
1
My answer would be "it depends."
Depends on how many people are in your team and what the quality of the engineers in the team is. Depends on whether leadership understands that using newer technologies does not pay off immediately. Depends on whether leadership is willing to pay more for engineers; those that work with modern stacks cost more. Depends on your current environment: how many dev teams, how many applications, how many VMs, how much traffic, etc. Depends on how fast the company is expected to grow. Are you going to be deploying 1 new application every year? Or perhaps 20 per month?
Answers to any of these questions could be the difference between using ansible or building a self service multi-tenant kubernetes platform in the cloud.
I know I am going to get grilled here because "consultants bad", but if you and management don't know where to begin looking, I'd recommend hiring some external expertise. Not someone to design something per se, but someone that can help you answer these questions.
1
My bad. You can try [Terraformer](https://github.com/GoogleCloudPlatform/terraformer), still active and does the same thing.
1
My Ben is hurt
1
My boss didn’t know the difference between volatile and non volatile memory…
1
My company uses GitHub. We used to use gitlab.
Gitlab does workflows and jobs better than GitHub imo. It's nicely visually organized, and can be kicked off as needed.
Since we already had a ton of scripts and workflows for gitlab, we figured we'd just run our own.
Company mandated we use GitHub - so we do, but for any repos that we used gitlab's workflows often, those get synced to the gitlab server.
1
My daily 'programming languages' are bash and yaml ;)
1
My experience was that it's worth reading. Maybe it sets you up for a better mentality (ie. doing what's right by the company instead of holding pissing contests between other engineering teams.) But in my opinion is that most of it is theoretical time waste besides that.
1
My feedback: we already have vast amounts of content out there on YouTube, Medium, paid online learning and advertisement-driven sites.
I don't think the issue is the content simply doesn't exist or we somehow need more of it.
What we do need more of is mentorship over cookie cutter content and time for learning for the enjoyment of understanding how things work. We need to foster a culture of wanting to learn over wanting to tick boxes to make you employable.
Unfortunately the trend has gone towards people wanting to get into the industry as quickly as possible to make maximum buck with as little learning as possible required. So if you're going to do anything, do something that makes people want to learn more.
1
My first experience with The Phoenix Project was being introduced to it by my senior and executive leadership group at a company of about 400 back in 2015. They were motivated to implement DevOps in the org and purchased copies for all of Ops, Dev, Security and QA.
Since we had the full support of the company's decision makers, the transition/transformation was a resounding success. Apps were rewritten, we introduced containerization and later Kubernetes, we created the first true CI/CD pipeline in the company which deployed from Dev all the way to Prod (incorporating approval gates, security requirements, Change Management, etc). Processes were exponentially faster (where automation replaced manual tasks) and virtually everyone was able to operate more efficiently, as a result.
On the flip side, I worked at an organization with 14,000+ employees and tried to get upper management on board with making the Phoenix Project a cornerstone of the DevOps transformation that they hired me to drive (there are many books that can serve this purpose, but because the way the story is told is very relatable to engineers in the trenches, it usually resonates a bit more, from what I have observed). Long story short, we did not have the support of upper management with most of our proposals, whether recommended reading or technology suggestions based on successful POCs.
Moral of the story is that, in my experience (and YMMV), the Phoenix Project is an excellent conduit for understanding the value of DevOps by presenting it in a relatable story format, rather than a more academic approach like some of the practical implementation books. Which style is better will be based on the individual, but I'd say The Phoenix Project works well for those who have little to no exposure to the concepts. That being said, without the support of senior and (key members of) executive leadership, it will be difficult to achieve the type of transformation required to go from traditional practices to modern ones (from what I've seen).
1
My goal is to stop whatever (until now I thought it's cloud-init) is messing with my resolv.conf :)
1
my hot take about this is that there's a lot of generally solved problems in software engineering and there's not really a need to reinvent the wheel when you don't have to. not every decision is made because of a cargo cult; in fact I feel like there's more not invented here syndrome in tech than not invented there syndrome. in many ways I'd prefer the startup that picks k8s over Nomad bc FAANG than the startup that reinvents k8s, but worse, bc of NIH
1
My last client did kubernetes but my team did primarily pipelines. I don’t have a lot of experience in kubernetes but wanna get some.
1
My last client used bitbucket with jenkins and is now in the process of migrating to Gitlab which is MUCH better.
1
My mistakes were mainly political. Who knew C-suite would get upset about town hall questions over the funding status of the startup I was working at?
1
My position on that is that in general, your startup isn’t interesting enough for me to put up with FAANG style interviews.
1
My post may have been a bit inaccurate, as I have already spent a year and a half feverishly studying Linux, networking, Python, Terraform, boto3, etc. Although if I can't find any DevOps jobs, I will probably go the entry-level route.
1
My ranking (having used all but one of these at scale) would be:
Grafana Mimir > InfluxDB > TimescaleDB > VictoriaMetrics > Clickhouse
1
My rankings here are approximating their suitability and ergonomics, but each of these should have benchmarks published on their README.
1
My suspicion is that you might be under-qualifying what I know and what I do. It matters not though, I appreciate that you shared something I was unaware of. Slow release cycles of EKS, AKS, and GKE are unacceptable without some form of backporting. To be completely fair I would hope that the vendor support from Amazon, Azure, and Google have some way to proactively handle zero day exploits. Though I don't have that level of trust at my disposal having been in the industry for so long.
1
My team are trying to implement something similar with Filebeat reporting time series logs to a 6-node Kafka cluster in front of a large ELK stack. It's currently struggling at about half the load you're asking for, so I'd suggest avoiding it!
I've had some success with InfluxDB but at a much smaller scale; glad to hear others recommend it as being capable of over 100k updates per second, but I imagine it may need significant hardware to sustain that kind of workload on a 90-day cycle.
1
My team has brought on several senior employees that weren't up to speed with tasks and the tech for ~3-6 months depending on complexity.
One thought on the company culture - good companies will take the time to mentor and train you in their tribal knowledge. Bad companies will not. Good companies won't fire you for making mistakes or taking a long time with things as part of your learning process.
I highly recommend A Cloud Guru for their hands-on labs, and for k8s, Nigel Poulton has a course on Udemy ($20) and Pluralsight ($ub$cription) that is absolutely fantastic.
One last thought, it's better to over communicate with your team and manager than wait to be 'found out'. Let them know where you're at and what your plan for learning and growing is.
1
My team is supportive though they suggest/push me to do more operation work or automation stuff like write simple tests for services.
I can ask for developer-related training, however I'm still not sure it's something I want to do as work, since it can be quite hectic and add way more stress to daily work life, instead of choosing a different path in the company...
I like small system administration tasks or troubleshooting, but not getting into development hell. I have seen one of the most talented members of my team take the biggest blow of all of us, so I am not thrilled to be in his position. Considering that I don't like developing.
I like work-life balance a lot, with an emphasis on life even more than work :)
1
My work is migrating jobs from Jenkins to Github Actions. We get nowhere near the free tier limit for Actions runtime minutes for now. $35/user/month for private organizations is not super cheap, though.
I wish Jenkins actually had more JVM features available rather than its narrow subset of the Groovy language.
1
My zone files are managed with Ansible, but I guess running your own authoritative nameservers with BIND is not so fashionable anymore.
1
Nah, Azure DevOps isn’t going away any time soon. Too many enterprise customers on it. Chatted with some of the MS guys about it a while back. ADO is great if you want a one stop shop of tools, end to end traceability and you’ve got MSDN licenses and stuff.
Honestly we just roll the incremental costs (extra pipelines, etc.) into our Azure sub and don’t even worry about it.
1
NAT is not a security feature, and you can have the same security measures regardless of whether you use public or private addresses. When IPv6 gets adopted properly, everything will have a unique public address.
1
Neither is Node you numbnut and you haven't mentioned that.
1
Neither. Just use native APIs and coding language of your choice.
1
.NET Core (product, core business).
Go & Node for tooling.
Would like to learn Rust.
1
.NET Core isn't a programming language though.
1
Never really used it but what are peoples thoughts about Argo Workflows?
1
New to clickhouse but can already say from a sysadmin perspective the clickhouse story is fantastic, using the DB itself is also a pleasure
1
Nginx is a common choice. You can also check out Caddy or Traefik.
1
Nginx is exceptionally lightweight, I'm not sure nginx could be considered overkill for anything!
1
Nice man! I will have a look at this!
I was interested in building something like this to manage a few containers.
1
Nice sounds like they were trying to entice you in 😉. Mine was an M4 competition, they sound pretty nice with the pops and bangs, I think I used a whole tank on accelerating and letting off just to hear the noises haha.
1
Nice try Putin.
1
Nice, quality post and discussions. Can someone post a link to a blog/video/article on time series databases please? Like, what is special/different about them compared to a vanilla RDBMS, and what are they used for? Thanks in advance.
1
Nice, thanks for the tips! I wasn't familiar with `cfn_nag` and `taskcat` looks pretty cool as well.
If you filter on [is:open](https://github.com/aws-cloudformation/cloudformation-coverage-roadmap/projects/1?card_filter_query=is%3Aopen#card-84110708) there are only 9 items in the "Shipped" column and over 300 in the others. I've definitely seen scenarios where Terraform had support for resources before Cloudformation did.
1
Nigel p.
Nina’s tech works
Mumshad from KodeKloud
Victor Farcic
You can learn a lot from these people
1
No
1
No buddy, not at all.
My goal is to switch from current duties to something more creative. I'm heading to machine learning eventually; the switch is not that easy at all. I have to do a LOT of studying, but I need some experience too. I found DevOps to be the golden choice for me, so I decided to work on it; I should be a DevOps engineer in 1 year. It is a great career, new experience, and I will learn new things that are also relevant to software development and ML.
TBH, I wasn't expecting an interview at all; I applied for the job hoping they would send me an online assessment so I could try it and assess myself, to know what ground I'm standing on.
Yet, they want a meeting, and I will not let this slip away from my hands and regret it later. I will do my best.
1
No documentation at all, neither in the form of code comments nor as a wiki or similar. Every bit of information is shared verbally. If the head of the IT department had an accident or cancelled his contract, the company would have a huge problem.
No idea, my guess "reddit".
It also supports the Influx TCP protocol, depending on which data ingestion you're OK with.
My suggestion is to check it out and see if it fits the bill. At the moment it doesn't have a "cluster mode", so HA or fail-over would have to be implemented by the operator.
1
No offense but what exactly do you do at work if you don't know any programming?
Are you basically clicking around in UIs other people programmed?
1
no offense taken.
Well, I haven't done any actual serious work so far, apart from writing some documentation about UIs and Kubernetes, deploying stuff, or writing some very, very simple tests in Python and pushing code that other people helped me write to git.
1
No reason not to, but this is also a nice way to provide a standard dev environment to your team which is also nice.
Either is fine.
1
No reason.
For IPv4 the reason is we’ve simply run out of public addresses. So we’re kind of forced to.
In v6 one should always use public subnets. They don’t even have to be from ranges announced to the internet. But no point risking some overlap / address conflict. Use unique global addresses everywhere.
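The overlap risk is easy to check mechanically before assigning a prefix. A minimal sketch using Python's `ipaddress` module; the prefixes here are made-up documentation addresses, not a real plan:

```python
import ipaddress

def find_conflicts(existing, candidate):
    """Return the existing networks that overlap the candidate prefix."""
    cand = ipaddress.ip_network(candidate)
    return [net for net in existing
            if ipaddress.ip_network(net).overlaps(cand)]

# Hypothetical address plan: two sites already allocated, one proposed.
allocated = ["2001:db8:a::/48", "2001:db8:b::/48"]
print(find_conflicts(allocated, "2001:db8:a:1::/64"))  # overlaps the first /48
print(find_conflicts(allocated, "2001:db8:c::/48"))    # no conflict
```

The same check works for IPv4 prefixes, which is where accidental RFC 1918 overlap usually bites during mergers or VPN peering.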
1
No way you're gonna be up to speed in 7 days and able to answer probably very specific DevOps questions with basically no knowledge at all.
Sounds to me like you wanted a high salary with low efforts with no actual clue of what the frick you're doing.
1
No you aren't you spammed that shitty site all over Reddit
1
No, even in the doc you linked - minor version. The release is a minor version but it is still a version.
1
No, I don’t know this page
1
No, no sarcasm at all.
Setup 3 servers with the help of Ansible playbooks to provision them with your tools and Docker, extra points for logic with distros/env variables.
Run Jenkins on docker on the test env.
Setup the Jenkins to test and build some random software or a project of yours.
Once it works and can be packaged make a stage where you package it and send it to the "prod" server, in there repeat last step (have Jenkins and a prod pipeline).
Setup prod pipeline to deploy the package.
This is more or less the high level; there are a lot of steps not mentioned, but following this idea you learn a lot and find ways to do stuff. Of course there are tools other than Jenkins, but it makes it easy to glue stuff together on the fly, whereas the others need some specific config done well from the beginning.
1
No, sensible doc structure does exist and the search engine isn't the main issue. People like to blame Atlassian and other platforms when the real issue is companies skimping on IT, recklessly outsourcing, and treating technical writing skills like an afterthought.
1
No, you have *different* ways.
This is just one solution. If it doesn't fit your work environment that's totally OK, other people might find it great.
1
No. Do they want to know about each invocation? Or just the function (which does nothing without being invoked)
If it doesn’t have a serial number, it shouldn’t go in the cmdb.
The terraform is your record.
1
No. Hard no. Would rather flip burgers.
1
Nobody here is actually ELI5.
Imagine you have rich parents who bought you two Nintendo Switches.
One Switch you carry around with you when you're going out. You're very careful with it. You keep it on your person at all times. You keep it in a carrying case. You'd have to be mugged for it to be stolen! So unlikely so why even worry about it. Then one day you leave it on the table at McDonald's while you refill your drink and someone runs by and steals it when your back is turned.
The other Switch you keep at home hooked up to your TV. You're not so careful about making sure it's safe as somebody would have to break into your home to steal it. It's in a trusted, safe place. Nobody from the outside world can easily get to it.
---
The first situation is like securing resources in a public subnet. Yes you can do it but you have to be a lot more diligent about it.
The second situation is like securing resources in a private subnet. There's a whole attack surface that's been eliminated.
1
No programming languages, but lots and lots of YAML and TOML files.
1
Nomad also supports other kinds of runtimes, which can be useful for hardware testing.
1
None.
Python, PowerShell, Bash.
All shell/scripting.
1
Nooo. That would be the equivalent of introducing the first shot of heroin at this level. The guy reads RFCs to lull himself to sleep; what do you think will happen if he discovers the 9th wonder of the world?
Joke aside, it is a costly hobby. But definitely worth it.
Source: I am u/lightwhite and I am an addict. I build mech keyboards for zen.
1
NoOps -- clicking buttons
1
Normally, your container registry has functionality to do so. A retention policy you can set and a GC that runs.
even the bare bone docker distribution has a GC.
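For registries without a built-in policy, the retention decision itself is simple to script. A minimal sketch of the "keep the N newest tags" logic only; the actual deletion (e.g. via the registry's HTTP API followed by a garbage-collect run) is left out, and the tag data here is made up:

```python
from datetime import datetime, timezone

def tags_to_delete(tags, keep=5):
    """Given (tag, pushed_at) pairs, return the tags beyond the newest `keep`."""
    ordered = sorted(tags, key=lambda t: t[1], reverse=True)  # newest first
    return [tag for tag, _ in ordered[keep:]]

# Hypothetical push history for one repository.
pushed = [("v1", datetime(2022, 1, 1, tzinfo=timezone.utc)),
          ("v2", datetime(2022, 2, 1, tzinfo=timezone.utc)),
          ("v3", datetime(2022, 3, 1, tzinfo=timezone.utc))]
print(tags_to_delete(pushed, keep=2))  # → ['v1']
```

Deleting a manifest only unreferences the blobs; you still need the registry's GC to reclaim the disk space.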
1
Northern Virginia. Basically right outside of all the tech offices here.
1
Not a junior, but I moved from more of a CI/CD background to SRE and the whole monitoring stuff has me bamboozled.
Like I feel like they just assume I should know everything about them because it's standard for a SRE to know.
But severely lacking knowledge on loki, grafana, prometheus, metricbeat and filebeat.
Is there a one stop shop for me to kinda skill up on this stack all at once?
1
Not a language but YAML and Jinja templates on a daily basis. Any *sh (bash/zsh/csh) here and there. I'm not a python dev but I fix or adapt python modules for my need (usually for Ansible). Lately giving some time to Go but haven't had the time to dive deep.
Outside of DevOps, C for my electronic hobbies and Javascript to mockup web ideas or do some "http/web" testings and load.
1
Not answering the whole question. Too many points to cover.
Answering just for
`I guess the point of this post is that companies should optimize for where they're at currently in the present moment rather than where they think they'll be at in the future, barring foreseen scale out events that might take place within a week or more, or doing something just because every other company does it without rhyme or reason. What do you think?`
**Working with a focus on where you want to be in the future allows you to monitor (learn about) your process / choices and keep making useful (meaningful) changes to make them effective for stable operations.**
A lot of companies that were extremely successful as a startup failed in the growth phase, because they did not plan for (work towards) the end state.
Hope this helps.
1
Not anymore. I can count on one hand how many scripts I’ve written in the last 5 years. If you’re scripting a lot, you’re not setting up your infrastructure correctly, not using the correct tooling, or you have over-engineered your solution. Currently work at a Fortune 100 company and have done dozens of cloud migrations previously, for the record.
1
Not as much as some would have you believe. I no longer remember every method that I could use on arrays in JS, Python, and Java. I forget some things with syntax, but when I need to code it’s not a huge challenge.
1
Not being able to explain it means you don’t have any kind of strategy
1
not bitbucket, their CICD sucks
1
Not contesting any of that but it’s surprising how many people can get by with ES in time series use cases similar to how many places I’ve seen get away with using basic Postgres like it’s a TSDB. Sometimes it’s worth the primary costs of licensing to avoid secondary costs of training, oftentimes it’s not.
1
Not DevOps myself, but I've seen many a coworker be proficient in
python,
most shell languages (bash and PowerShell specifically),
and at least one C++ spawn (Java, C#, Rust)
Although I think the last one may not be used on the job as much, it's still super useful to learn when working in that environment
1
Not disagreeing with many of your points but adding to it, Pulumi has a declarative language now:
https://www.pulumi.com/docs/intro/languages/yaml/
1
Not giving access to our board to our "scrum master" will result in me getting fired and replaced by someone who will.
1
Not having an easy to use tool to create the documentation. If it's not dead simple to write and share documentation, nobody is going to even bother writing it.
1
Not in this installment.
1
Not including the salary range in a posting is a red flag for me. I don’t even entertain a listing if it doesn’t have a range posted, and “what’s it pay” is always my first question to a recruiter. I frame it a bit more diplomatically as “I don’t want to waste your time or mine, so I need to know if the pay range is acceptable for me.”
Frankly, I think all of the job sites should make it a required field.
1
Not lying to the candidate. Especially hard if you are there together with a manager who says to “zip it” when the candidate asks about the amount of technical debt we are struggling with.
Managers seem to have a hard time understanding that hiring is expensive and it's better to either pay someone well or find a candidate willing to put up with their shit.
And for the key assessment - validating that the candidate can use brain, I had “Architects” who failed to google shit to solve easy things.
1
Not me but a guy I worked with was using laplink with a serial cable between some of the really old compaq portables, the huge ones that the keyboards folded into the screen. He copied the data the wrong direction. I always had the habit of keeping the destination to my right but yeah I get it, it is easy to get confused.
I myself trusted someone that it was "okay" to stack switches during work hours. That did not go well.
1
Not me looking around his office to try to find out if he has this or not already lol
1
Not me personally, but an on-site engineer was eating some snacks in a dc and left his snacks bag over the fan in the rack, which caused overheating and resulted in damaged equipment and of course an outage. Fortunately no fire.
1
Not only that, but the cost of migration, learning a new tool, new processes, new whatever... It'd probably take years to make net out the cost to switch over.
1
Not really sure why you are implying I'm being unkind.
Civo has been aware of this for over a month with no action. At this point if nobody calls them out, they likely won't fix it proactively. It will be resolved as a reactive issue after a big security vulnerability is found and their customers are compromised, resulting in your data being leaked because you're a consumer by proxy.
1
Not super-experienced with either, but the advantage I hear about most with Go is its ability to be easily packaged into a single binary and run from within a container environment in a leaner fashion than doing a bunch of `pip install yadda-yadda` in your Dockerfile.
1
Not sure how much experience you have, but the Atlassian suite of tools, when used correctly, like any other tool, are far better than many others out there.
1
Not sure I totally grasp the question, but the challenge is designed to be difficult, but doable, for people who are juniors. Many people have used it to help land their first job in a cloud or DevOps technical role.
1
Not sure if this is an issue too, but you also cannot just 'use the internet' in AWS China in the same way as other regions (at least when my company looked at it). This means you need to use Direct Connect and MPLS from the local Chinese carriers.
1
Not sure if you are still following this post but I was wondering if there’s an AWS version for the Azure section.
1
not sure if you mean this, but you basically want to have a more humanely readable plan output that maps to resources in a different way?
1
Not sure to what extent it's relevant, I use ephemeral instances (VMs and containers) for CI/CD pipeline. They spin up when there's a job and shut down after the job is done. The on-prem servers running them auto suspend after a predefined idle time. The infrastructure is setup to use WoL to wake them up as needed. Adds a couple of seconds to build times but otherwise is transparent to the rest of the system. Well, the web UI sometimes shows timed out jobs, but the pipeline understands that this is a retry-able issue and just re-queues them until they actually run.
It looks like it saves some money on electricity, and there is less need for maintenance on the servers (e.g. dust). If you run cloud services rather than on-prem, you can probably save much more as the total costs scale more linearly.
The issues other comments mentioned are of course possible to happen. From my perspective it sounds like inadequate automation/testing.
I did try to run developer environments in the cloud (vs code server) but there wasn't a clear indicator of when the service is idle so I haven't figured out when to auto suspend/shutdown the VMs. Compared to the CI/CD pipeline, there it is easier to detect idle state (no VMs/containers running), and jobs have a clear start and finish. However, if implemented, that would definitely save money, as I said the costs scale more linearly with respect to uptime hours.
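For anyone curious about the WoL part above: no special tooling is required. The magic packet is just six `0xFF` bytes followed by the target MAC repeated 16 times, sent as a UDP broadcast. A minimal sketch; the MAC address shown is a placeholder:

```python
import socket

def magic_packet(mac: str) -> bytes:
    """Build a Wake-on-LAN magic packet: 6x 0xFF, then the MAC 16 times."""
    raw = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(raw) != 6:
        raise ValueError("MAC must be 6 bytes")
    return b"\xff" * 6 + raw * 16

def wake(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Send the packet as a UDP broadcast (port 9 is the conventional choice)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        s.sendto(magic_packet(mac), (broadcast, port))

# wake("00:11:22:33:44:55")  # placeholder MAC; needs WoL enabled in BIOS/NIC
```

The target machine's NIC and firmware have to have WoL enabled, and the broadcast has to reach the target's L2 segment.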
1
Not sure what problem you’re looking to solve. Why do you need rate limiting? Why wouldn’t you just add more dynos?
1
not sure what you mean. Can you be more specific?
1
Not sure who hurt you but Azure really isn't that bad to interact with.
1
not sure why i had to scroll so far to see this. so much more fun than python
1
Not sure why it was mod'ed. Chatted you link to blog.
1
Not surprised by this story. I always advocate that if you're going to use ES, it should never be the only place you store any single piece of data.
1
Not the op. But I'll give it a go.
Atlassian Bitbucket (git) and Bamboo (CI/CD) were designed and developed at a time when cloud and ephemeral targets were very few and far between.
Everything in their primary tooling and setup presumes static targets. They were not built or designed to integrate in a gitops mode, or to natively call many k8s clusters based on environmental promotion methods especially in multi branch pipelines that run robust tests and all collapse into main branch for deploy and promotion. No built in ways to easily define dynamic or child pipelines. Agents are statically configured, and no easy way to spin them up on demand with correct AWS role profile, for instance.
In a way they have similar weaknesses, imo, to Jenkins, though Jenkins is more extendable.
Bamboo is a large installation that runs in a jvm with very specific requirements if you're doing local install.
No native advanced deploy methods: canary, blue/green.
It's been some years since I used it last so I don't remember every shortcoming of it, but if you're working in a more modern environment with containerized workloads, atlassian cicd is a system out of time.
They are trying to fix it by adding more functionality on top, but the core is still not built for it, and it shows.
1
Not to deviate from the intended topic, but I would not consider ECS to work “just as well” as Kubernetes. It’s quite shit, in fact, IMO: all of the struggle, sluggishness and lack of visibility of Elastic Beanstalk, just container-native.
It’s babies first Kubernetes, and I’m glad I don’t use it anymore.
1
Not understanding that there are different types of documentation and they require different approaches.
https://documentation.divio.com/
Divio had a really good write up on the 4 types of documentation. When you start to treat them separately it becomes easier to create higher quality docs.
We also found that tools like a Confluence are great at dumping text, but it’s hard to see the wood for the trees. Search will pull in irrelevant pages and you have to hope a project doesn’t use a generic codename.
1
Not unreasonable at all, just very simple.
1
Noted, appreciated. I will put more stress on this. This is the core of it indeed.
1
Noted, thanks!
1
Nothing on the internet should be talking to any of your infrastructure directly, for obvious security reasons. The public subnet is for load balancers, etc. It's a lot easier to enforce this type of setup and say "If it needs to be accessible from the internet, it needs to be on a load balancer and it needs to have certs installed." Private subnets can be a lot more open this way, so that things inside the private subnet can talk to each other with broader security rules.
It's for a lot of the same reasons your home network is set up the same way. There is a single ingress point instead of hundreds or thousands that you have to manage.
1
Nothing. I like Vault a lot for many reasons, but you never know if there's another up-and-coming tool or library that has promise.
1
Now I don't feel so bad about it, ~ hr leakage territory though.
1
Now I'm not really sure.
Googling the issue, I'd expect some headers in resolv.conf indicating that it was set by cloud-init.
I have also tried adding a name resolution block to cloud.cfg to see whether that goes into resolv.conf or not. It turned out it doesn't. So now I'm not sure resolv.conf is managed by cloud-init.
It needs more investigation I'm afraid.
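One way to narrow the investigation down is to look at the comment header most managers stamp on the file. A rough sketch; the signature strings here are typical examples and vary by distro and version, so treat them as assumptions rather than a complete list:

```python
# Typical header comments left by common resolv.conf managers (assumptions;
# exact wording differs across distros and versions).
SIGNATURES = {
    "NetworkManager": "Generated by NetworkManager",
    "systemd-resolved": "systemd-resolved",
    "resolvconf": "resolvconf(8)",
    "dhclient": "dhclient",
}

def resolv_conf_manager(text: str) -> str:
    """Guess what last wrote resolv.conf from its comment header lines."""
    header = [line for line in text.splitlines() if line.startswith(("#", ";"))]
    for name, needle in SIGNATURES.items():
        if any(needle in line for line in header):
            return name
    return "unknown (possibly hand-edited or cloud-init)"

sample = "# Generated by NetworkManager\nnameserver 192.0.2.1\n"
print(resolv_conf_manager(sample))  # → NetworkManager
```

Another quick check: on systemd-resolved systems, /etc/resolv.conf is often a symlink into /run/systemd/resolve/, which is a giveaway regardless of the header.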
1
Obviously you don't want to do this by hand...
Add the following to your .gitlab-ci.yml:
```yaml
tag_image:
  image: docker
  script:
    - docker login -u ${PRIVATE_REGISTRY_USER} -p ${PRIVATE_REGISTRY_PW} ${PRIVATE_REGISTRY_URL}
    - docker pull ${PULL_IMAGE}
    - docker tag ${PULL_IMAGE} ${TAG_IMAGE}
    - docker login -u gitlab-ci-token -p ${CI_JOB_TOKEN} ${CI_REGISTRY}
    - docker push ${TAG_IMAGE}
```
1
of course you can write a script for it. if you can describe what you want to happen accurately why couldn't you make a script for it?
1
Of course, starting from scratch is always harder than using something premade, but then it's not a fair comparison. It's like asking "is cooking dinner really that easy?" when preparing something from Hello Fresh in a fully kitted out kitchen and with staff doing chores, vs having an empty room, a credit card and a shopping list.
1
Of course. You'll find some kind of cargo culting everywhere. Probably you're to some extent guilty of it yourself. So am I, I'm sure.
Usually it stems from not being able to research and figuring out everything yourself. So you use the heuristic of just copying other seemingly successful people and companies.
1
Of the two I do not recommend Bitbucket. GitHub does everything better except for Jira integration.
1
Often documentation isn’t included in company practice, so nobody actually owns it and you see all the common problems most of us are familiar with.
Make curating, organizing, updating documentation someone’s deliverable, and it might get done sanely.
1
Oh
It is true
Thanks
1
Oh dear, this sounds like an xy problem gone mad. Fix flask and websockets you crazy fool. Ask for help with the real problem.
1
Oh fuck off. If someone cant do their job, they dont get the title. They need to work on that journey like everyone else.
We fix and engineer systems that monitor themselves.
There is no “Junior heart surgeon” is there bud
1
Oh God. . . this fucking answer. Gatekeeping ftw.
1
Oh I agree with you.
1
Oh let's see, in my 25+ year career I have:
* Unplugged the wrong network cable from prod ERP system bringing the entire campus HARD down during peak registration. Took over 2 hours to bring the DB back to a consistent state after a network "outage" like that
* Deleted some files in `c:\temp` directory that were not TEMP files - she had just put them there *temporarily*. They were a series of wav files recorded from her trip to Europe to interview a person that had now passed on and were unrecoverable.
* Dropped the wrong DB from correct environment (more than once)
* Dropped the correct DB from wrong environment (more than once)
* Called my boss some choice words (more than twice)
* Running load test against PROD during the day (before we rearchitected to handle the load)
* Broke IAM permissions to S3 bucket so AWS had to reset them to default
* Spun up >10 M5 class EC2 instances running a terraform test and forgot to run `terraform destroy` when I was done (took a billing cycle to find that one - now we have alarms on that!)
SO many more, I could go on for days! Luckily I have so many more wins, otherwise, looking at that list you'd think I would have got fired for any of them. But I've never been fired and only been promoted. Now I'm an SRE making $150K+/year
1
oh now I see, you are DevSecOps. You definitely need to keep learning more than your job demands.
From Personal Experience. (Trying to give relevant example for DevSecOps.) I am sure you are already following many of my below suggestions. If you are then Kudos you are on the path to learning :-)
**Carve out an hour daily (including weekends :-) ) to focus on any one of the below.** Keep switching the focus area either weekly or daily depending on your comfort level.
1) Follow these sites to learn about the latest developments in cybersecurity
* [https://www.darkreading.com/Default.asp](https://www.darkreading.com/Default.asp)
* [https://nakedsecurity.sophos.com/](https://nakedsecurity.sophos.com/)
* [https://thehackernews.com/](https://thehackernews.com/)
* [https://www.csoonline.com/](https://www.csoonline.com/) (my fav)
2) If you are responsible for (example) Java related DevSecOps, periodically go through this [https://owasp.org/www-project-security-knowledge-framework/](https://owasp.org/www-project-security-knowledge-framework/)
Best source of information in my personal experience.
3) If your organization has a choice of tool for scanning code for vulnerabilities, then
a) keep up with the latest features and learn about them at the earliest
b) Pick two immediate competitors of this tool and learn the latest features they offer.
These 3 steps should keep you current with almost 90% of the latest developments in the security and DevSecOps area.
1
Oh, so one time I worked at Rogers……
1
Oh sure. I'd imagine I'm looking at a bunch of lambdas for communication between the frontend and backend. And I'm still learning HTML/CSS/JS... And it's not really a fully functional web application, since it would require authentication, security, multi-tenancy, and so on.
1
Oh yea, it's a terrible idea and I wouldn't recommend it for anything but a learning experience.
Take a look at [what this guy has done](https://github.com/bluxmit/alnoda-workspaces) with Docker. I spent some time modifying his stuff to work for me and also learned a lot about building Docker images.
His [dockerfile](https://github.com/bluxmit/alnoda-workspaces/blob/main/workspaces/ubuntu-workspace/Dockerfile-20.04) for a basic ubuntu container can give you a great starting point.
1
Oh yeah, definitely. I hadn't necessarily thought of it being an actual application to be used by anyone, more of just a "go to this link and play around with it, then check out this Github and look at the pretty code" sort of project. So it wouldn't really need multi-tenancy or authentication for individual users.
Side note - SSM for auth? Did you mean something like Cognito?
1
Oh yes. Definitely a topic that isn't discussed almost daily here.
1
Oh, and to your question: The biggest frustration is that it is not read, not updated, not anything... and people don't view it as the brains of the company and still ask around for solutions to everyone.
1
Oh, I see what you're getting at. So 1.22.9 is 3 months old compared to 1.22.2 being somewhere near like 9. You can work around a lot of problems with the networking control you have in AWS... but there's no excuse for that. Getting a k8s and k3s build out can literally be automated. The CNCF provides a script for it and the submission steps can be scripted.
1
Oh, I'm sorry, I think my context was off. I'm not expecting to become DevOps first day. I would gladly take an entry level job as anything really.
1
Oh, of course its Spring/Maven :D
Alright. You will have to make a choice here. What Maven goals do is nice and all when you have one instance running, when downtime during deployment is okay or when you are developing locally and using the maven goals to keep the DB up to date.
This all flies out of the door once deployments to the backend and multiple instances are involved. The correct way to do it is roughly the following:
1. Get rid of the maven plugin or restrict it so it only runs in local dev env
2. Check in the SQL scripts and create deployment artifacts out of them
3. Run flyway as the first step of a deployment, if it succeeds continue
4. Deploy the new version of the application
5. Grab a coffee and enjoy your deployment
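A rough sketch of what that order looks like in a CI config (stage names, the flyway flags, and the deploy script are illustrative, adjust to your pipeline tool):

    # illustrative pipeline stage order, not a drop-in config
    stages:
      - migrate
      - deploy

    migrate-db:
      stage: migrate
      script:
        - flyway -url="$DB_URL" -locations=filesystem:./sql migrate

    deploy-app:
      stage: deploy
      needs: [migrate-db]   # only runs if the migration succeeded
      script:
        - ./deploy.sh "$VERSION"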
&#x200B;
I hope that this is starting to make sense. You simply cannot have multiple processes trying to run schema migrations on the database at the same time. All but one will most probably fail. What is the app supposed to do then, shrug and assume that the DB is in the desired state? Fail, restart, try to migrate again? Hopefully by then the DB is in the correct state, but this deployment process sounds really flaky and there is absolutely no reason to do it that way.
When you run migrations in the deployment pipeline you ensure two things:
1. The database will be in the correct state prior to the web app starting
2. If the migration fails for any reason, the rollout of the new version will not be performed and
3. actually one more - you will potentially have zero-downtime deployments
&#x200B;
This all assumes that your migrations are backwards-compatible with the previous version. As the old instances are still running on the cluster (I will assume k8s), the schema changes but since there are no breaking changes, they can continue to operate. Btw this is not a restriction of the 'migrate during deployment' flow, you have the same problem when you start replacing old instances with the latest version and they try to change the schema while old instances are still running.
1
Oh... Yeah I think *this* is my real issue. The index.html file doesn't need to be in the repo at all, it just needs to be deployed. Thanks so much.
1
Ohh nice, care to share which certs? I’m working my way up to DevOps.
And it seems that you are already in Cloud, good luck with the job hunt.
1
Ok
1
Ok maybe I could tell them to contact me if I don't show up.
Taking days off isn't really possible. I was thinking of asking them to allow me to work remotely when this happens, but they have a weird company policy and I may only be allowed in half a year. Other people on the team work remotely.
I think I should be more careful with my sleep and this shouldn't happen.
1
Ok, this was a bad example.
1
OK, wow I legit expected GKE to be the poster child of "keeping up to date". Interesting, I always assumed that what MS is doing is the industry standard practice. As for recent AKS experiences, so far the only thing I occasionally had to deal with was node pool upgrades which get stuck for almost an hour, then magically continue and finish quickly in the matter of minutes. It's super annoying and I haven't found a reason why this might be happening. But I guess that's a fair price to pay for zero downtime updates :D
1
Ok; a few people have mentioned it already, but in your position I can't scream ansible loud enough.
Ansible is not the best tool for the job it does, but it is as easy as possible for a manually-ssh'ing-sysadmin to migrate to.
To be honest I'd almost forget about everything else at this point; docker, k8s, etc for now until such point that you have a solid ansible setup.
What do you use ssh for? See about making it an ad-hoc ansible command. Once you've done that make it an ad-hoc ansible command that you run on multiple hosts, and once you've gotten that done, wrap it all into an ansible playbook. Every time you find yourself ssh'ing into a machine (particularly a prod one) ask yourself if this could be done with ansible.
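As a made-up example of that progression, a "ssh in and restart the service" task ends up as a playbook like this (the host group and service name are placeholders):

    # hypothetical playbook version of "ssh in and restart the service"
    - hosts: webservers
      become: true
      tasks:
        - name: Ensure nginx is restarted
          ansible.builtin.service:
            name: nginx
            state: restarted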
Once you're comfortable with ansible, and your machines are cattle and not pets, the next step I'd recommend would be monitoring. Ideally the correct answer when someone calls you telling you there's something wrong, is that your monitoring system already detected it and you're working on a fix.
1
Okay I've just started with web dev right now. Gotta learn a lot of stuff now!
1
Okay it all sounds great then thanks I'll take a look and see how I get on, I had a feeling what I was doing was probably convoluted and ridiculous sometimes you can't see the wood for the trees haha
1
Okay perfect, I'm trying to set this all up right. I'm hiring a software development firm and mapping everything out in Jira to try and make sure I'm as specific as possible and get everything I need/they can't argue about scope changes or something if it's all mapped out. Someone said it makes sense to use BitBucket over Github due to native integrations.
1
Okay, I'm only going by the docs where it explicitly says you can't do multi-machine agent setups with the OSS version. (Agent being the older term for runners.) And when I built from source last year I found this to be true.
You can absolutely download the official binary or docker image and use that in a multi-node capacity but that's not the OSS version. I believe you need to compile for OSS specifically. I used drone in this capacity before purchasing a license.
When you're using the non-OSS version you're subject to potential limitations, but it otherwise behaves in a full-featured capacity.
The Enterprise FAQ plays it out pretty clearly:
https://docs.drone.io/enterprise/
1
Okay, you mean to put the allowed URLs in the authorization policy list. Then what is the ServiceEntry needed for? Sorry, I am new to Istio.
1
Old habits die hard, but it’s been probably 12 years since I’ve used a windows machine at work.
1
Old versions are super common for stability. Go look at what the default version for managed k8s from any cloud provider is. It will be anywhere from 1-3 versions behind current, sometimes more.
1
On k8s we have a wildcard A record pointed at our AKS cluster and a wildcard cert. any new services can just use whatever fqdn they want, as long as it’s behind the original wildcard or a path.
1
One authpolicy to deny traffic and one to allow. Either based on source as SA or NA etc.
ServiceEntry you add just to allow the mesh to know about the AWS URL.
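A minimal ServiceEntry for that looks roughly like this (the host and port here are examples, not the poster's actual config):

    apiVersion: networking.istio.io/v1beta1
    kind: ServiceEntry
    metadata:
      name: aws-api
    spec:
      hosts:
      - "*.amazonaws.com"
      location: MESH_EXTERNAL
      resolution: NONE
      ports:
      - number: 443
        name: tls
        protocol: TLS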
1
One interview question that i ask people is:
Let's say you have a code repo in github, a simple hello world application in whatever language you have any experience in working with (not necessarily programming, but even able to look at and it makes sense), and you need to build a pipeline which gets that into production in your cloud vendor of choice. Off the top of your head, talk me through what you would do to build that pipeline.
DevOps is about optimizing flow of value into the market, and in the simplest ground level breaking into the industry terms, it is doing exactly what i ask in interviews above. Now, there is NO right answer to that question because the industry is so diverse at this point that there are O(N) ways to pull that off. I will usually hire even the most Jr people if they can fumble their way through answering that question with what they believe is the right way to do that.
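One of those O(N) possible answers, sketched as a minimal GitHub Actions workflow (the build and deploy commands are placeholders for whatever the candidate would actually describe):

    # one of many valid shapes for such a pipeline
    name: build-and-deploy
    on:
      push:
        branches: [main]
    jobs:
      deploy:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4
          - run: make build test   # build and test the hello world app
          - run: ./deploy.sh prod  # ship it to your cloud vendor of choice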
I think where people get hung up in the conversation is really around things like:
* Do we use ansible or chef?
* CloudFormations (or ARM for you Azure folks) or Terraform
* Go or Powershell
* Kubernetes or OSO
* Tool / language or Other Tool/Language
Don't get me wrong, knowing tools is for sure a plus, but if someone shows up for an interview and doesn't understand the actual intent of DevOps, then the personal narrative shift is going to take a lot longer to coach and mentor through than simple tool skill development.
1
One of my colleagues tried to push Terragrunt quite hard. We have a relatively small environment ( 6 AWS Accounts with some generic VPC/EC2 workload) and frankly I don't see a point of using a wrapper like Terragrunt for such small environment. The features that are still not available in Terraform that it provides are mostly geared towards larger environment with complex needs. I get it when you have a lot of resources to manage and need to have a well tuned way of ensuring your Terraform codebase is not getting out of control but I am not convinced the switch is worth for us.
1
One of my ideas lately was around zero downtime deployments. I want to update my frontend and backend with minimal impact to users. So I thought why not create the web serving layer as being event based if you have a SPA you can have a protocol that isn't rest but enqueuing of an event ala event sourcing and (it would be good for redux design) and then you don't upgrade that component because all that component does is enqueue events to a queue. I can therefore upgrade the SPA and the backend systems that talk to the queue without worrying of updating the serving layer. I can even upgrade the frontend before upgrading the backend or vice versa. I don't have a name for it but splitting your architecture that way gives you a lot of flexibility in deployments. The message format would be version independent - the client would ignore or cache events it doesn't understand and so would the backend.
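A toy sketch of the version-independent message handling that idea relies on (the event names are made up): the consumer simply skips event types it doesn't recognize, so either side can be upgraded first.

```python
import json

# event types this version of the consumer understands
KNOWN_EVENTS = {"add_item", "remove_item"}

def apply_event(state, raw_event):
    """Apply one queued event; silently skip unknown (newer) event types."""
    event = json.loads(raw_event)
    kind = event.get("type")
    if kind not in KNOWN_EVENTS:
        return state  # event from a newer version: ignore (or cache for later)
    if kind == "add_item":
        return state + [event["payload"]]
    return [item for item in state if item != event["payload"]]
```

Because unknown events are a no-op rather than an error, a frontend emitting new event types doesn't break an old backend, and vice versa.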
1
One of the benefits that CDK brings is that your IaC can be configured in the language your team use. For example, if your application is written in Typescript, your IaC also can be.
1
One of the issues CircleCI shares with Gitlab is Cost. So I don't think Circle CI would be a good alternative.
1
One service is a python web app which loads and hooks into a sqlite db file on disk. I could containerise it, putting that file on a persistent volume, but I would have to know it won't ever be scaled the way it is. At least I would gain something I believe from containerising it but not a huge amount.
I'll be figuring this out on a case by case basis, and the goal would be to containerise everything across bare metal. But likely there would be the odd few which would run on VMs like they currently are.
1
One word. Homelab.
1
Only if the resources already exist and are present in the state
2
Okay, so you would in fact use Jenkins? Because you probably have a long history with it? But as a newcomer to Jenkins, is it still recommendable?
1
Oof yea, one predecessor had a thing for Ruby so we've got odd patches of that I come across.
1
Oooo love this idea!!
1
OP, if you want it fast, look for kits instead of just the base models. You might be discarding an SD card or not get the case you want, but if you really want it on a deadline, there are kits around.
1
Op describes a problem where the state of data in the db causes issues, it’s not uncommon for a record to get in a bad state where your business logic cannot correct it, yes it’s a bug, yes there is a deeper issue that needs fixing. But for now what you normally need to do is run a query targeting the exact record that has gone wrong.
Perhaps you could have a preprod env like you describe. For me it feels more pragmatic to address the root causes and run the query. Build an admin panel.
What i think is unpragmatic is building process around what should be short lived issues.
1
OP: while far from ideal, you CAN create a docker image from an existing Linux server. I don’t know if you should, but it might also help you with peace of mind and creating a Dev environment.
Just make it a temporary fix until you build a proper container:)
1
Ops guys who can use IaC tools to provision cloud infrastructure for the Devs
Or Dev guys who can use IaC tools to provision cloud infrastructure for themselves
Is that what DevOps actually is? Don't know, don't care. It's what companies often think DevOps is and is how they hire accordingly
1
Opted out all our several million customers from receiving all emails (including order confirmations). Took like 2 weeks to fix.
1
Or "that's how it is done" or when you do smth in a different way (even though your solution is better) they just say they don't like it because "they didn't see smth like this before"
1
Or `tmpwatch`, which was made for this purpose.
1
Or better yet, don't set any limits on compressible resources :)
See -> [Stop using CPU Limits](https://home.robusta.dev/blog/stop-using-cpu-limits/)
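In pod spec terms, that advice boils down to something like this (the numbers are placeholders):

    resources:
      requests:
        cpu: "250m"        # scheduler guarantee; CPU is compressible
        memory: "256Mi"
      limits:
        memory: "256Mi"    # memory is NOT compressible, so keep a limit
        # deliberately no cpu limit: avoids throttling under burst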
1
Or don't use any Jenkins plugins.
https://github.com/DontShaveTheYak/jenkins-std-lib
My library has some really cool features like being able to run GitHub actions but doesn't use any plugins.
1
Or is possible to get the same functionality using free version?
1
Or the original M135i/135i - I used to have an M235i before the M2
1
Others have said it but if your company is really balking at a 2k per year expense then management is clueless. Not having effective CICD costs way more in lost developer time. Switching to another CICD provider will cost more in developer time.
Well if you are forced to switch avoid bitbucket and circleci. Easily the worst two CICD platforms I've ever had the misery of using. Self-hosted jenkins can be fine but is falling out of favor due to its bugs and stability issues. I'd suggest just going with Github Actions, it covers most use cases these days and is fairly reasonably priced.
1
Our company has to meet SOX compliance as well. SOX is loosely defined (to my surprise, when I looked into it). Regarding the pipelines, it basically states that you must be able to trace and identify who made which modifications and no modification can be done by a single person.
On our scenario this was translated to pipeline approvals to the SOX enforced environment.
The apps codes are tagged and automation fires the build and deploy pipeline. With the image ready, the app is deployed after two people, one from the dev team and other from infra/devops/operations team (whoever will be among the first responders if there's a problem). Only then the deploy is done.
It's also necessary to keep the logs regarding these deploys+approvals but I don't know for how long.
If we need to execute some activity not covered by a pipeline (emergency fix or whatever), we create a ticket on Jira describing the intended actions and another automation generates a Google Meets link we use to record everything that's done. Again, the person approving the change must not be the person executing it and the recording must be kept for auditing purposes.
I hope this very generic information helps...
1
Our Dev Stacks have a shutdown time on them by default (six or eight hours after startup). There's a web interface that allows them to turn on their stack (self service). This has reduced our operating costs significantly and there really haven't been many resources expended on the overall functionality.
Our biggest issue with this is that the Dev Users fail to start their stacks and just ... panic when they get an error. Even though there's been a bunch of training and hand holding previously.
The Ops team can extend it past that initial time up to 7 days for special requests.
1
Outdated documentation. Too often a change is made in how something behaves or the interface is restructured, and the out of band (confluence) documentation remains untouched.
1
Out of interest, what do you think triggered the change?
1
outright R53 records in CF LOL !
1
Over hosted stuff, certainly. If I were picking something to learn, I would compare it with GitLab before making a choice. Jenkins is powerful, but unwieldy, as it requires understanding so many things to make it do what you want, and you can usually do them in too many completely incompatible ways. Updates can be a pain too, particularly if you don't do them frequently. If you're learning it, always remember that the system log is your friend and will almost always point you to where any issues are.
1
Overwrote `/dev/null` with an actual file on all servers, including production, causing all of them to slowly die until restarted. On a Friday evening.
1
Parkmycloud.com
1
Patching, OS maintenance, the underlying virtualization platform, underlying storage platform, networking, blah blah blah blah. I think you're burying the lede a tad, no? :)
1
Pay the OSS creator to fix it for you? That's what quite a few companies do.
If your whole business is based on someone else's work -> maybe, just maybe, it's time to hire or pay that person.
1
People not following it. The good ol': *I know how this is done!* and then breaking something because they didn't know they had to do something before or after a change.
1
People not reading it or generally not knowing about it. Even when you give them a link to the page they ask questions that are clearly on the page but they were too lazy to read it.
1
Perhaps in your case just change the annotations on your pod to:

    annotations:
      proxy.istio.io/config: |
        outboundTrafficPolicy:
          mode: REGISTRY_ONLY

This will deny outbound traffic to anywhere that is not defined in the mesh. And then simply add ServiceEntries that are exported globally, or to the specific namespace, for the services you want the pod to be able to access.
So in that case you don't need an AuthPolicy nor a Service Account, basically.
And if you need to deny access to other mesh things you do it with the AuthPolicies as always.
Hope that works for you.
1
Perhaps you meant "Infrastructure as code"? Data is what the apps store and manipulate.
Any "cloud" service is in essence infrastructure created via code. One could argue the same is true of on-prem VMs and containers if deployments are scripted.
1
Personally, I like a standard. In other words, I don't want to come work at a company and hear "oh the last guy liked Pulumi and everything is written in C# soooo...". Having IaC written in whatever feels most comfortable to the engineer who happens to be there that year is not a good business continuity strategy.
Yes companies adopt standards that their apps/IaC are developed on but what I'm talking about 100% happens at a ton of small companies. If my company wanted to adopt Pulumi, we would 100% use YAML because everyone knows it and the chances that the next engineer behind you can pick up what you just laid down is extremely high vs. any other alternative.
1
Physical layer issue, but I cut the uplink to 4 bonded T1s when we were demolishing a building. Entire business has no internet for 48 hours. Luckily it was the Wednesday before Thanksgiving and the ISP got it fixed by Friday. 😅
1
Pipelines are also code, which should be in version control.
1
Please connect to this IP address via terminal, password is XYZ.
Then:
- please tell me what you think is currently wrong with this system? (ppl check memory/cpu usage, docker containers running, rarely anyone checks if there are zombie processes or whether the system is under heavy load that is not CPU/MEM bound)
---
Other:
- Under /tmp there is a simple bash script that doesn't seem to do what it's supposed to do. Could you explain to me what you think it does and why it does not work as expected?
(very simple data parsing with issue in flag usage or premature termination)
You also have full access to the web.
You don't need HARD problems to see how someone thinks. A crappy dev would throw in the towel or do a round trip about anything.
1
Please don’t.
Go for an entry level job instead. Learn to code, learn different tools, be familiar with Linux and networking. That might take a few years; then consider DevOps.
1
Please listen to this ^^
1
Please see the pinned monthly thread if you wish to know how to break into the industry or have questions along those lines.
1
Please stop saying "RAID array". It's redundant and makes you sound like you don't know what you're talking about. It's like hearing people say "NIC card".
1
Please take my thoughts with a huge grain of salt, but I thought I'd share some takeaways I've had:
- first, agile is different than scrum. Scrum is a specific implementation of agile principles, which is to say you can be agile but deviate from scrum
- in my opinion, all formal interactions should exist to clear barriers for your people, or keep the team's work strategically grounded with the company's principles. The product manager should really be in charge of that piece.
- Agile does not change code standards and other sound engineering principles. Agile is opinionated as to how you take risk. If you push risk acceptance for code quality/design choice too high too often, you might as well not use agile. That's not to say there isn't a time to push things up, but you should be very careful to minimize the need to do so.
- an agile environment is normally self empowering for team members. But not all team members are created equal. You may decide that agile isn't for your team because you don't trust them (to either produce or be technically competent), but I'd recommend seeing how a few months go with the freedom that comes with agile before judging.
All this being said, tools to do this are everywhere. We use Clickup as an all purpose task management tool and it has worked well. They have some helpful automations/views for sprint programming.
Otherwise Microsoft Planner, Trello, or literally an excel sheet for kanban
I come from a strange background, so perhaps I'm off base from other engineering perspectives. I hope it works out for you!
1
Plugged in a wrong switch as network engineer with wrong VTP information in an ISP. Took down a city and about 50k businesses for 30 min. 😀 No, didn't get fired.
1
Plus if you're otherwise successful but failing due to scaling, someone will buy you. Scaling is in the class of rare problems that throwing money at will almost certainly fix short term.
1
Point!
1
Pointed my empty development Artifactory to production as a replica. Replication worked and all the packages were purged!
1
Post examples of your code. It will help to see where it goes wrong
1
posted security keys to a public GitHub repo
1
PowerShell and C# / .Net
1
PowerShell gets a lot of hate but I’m hella productive in powershell core
1
Powershell only on Windows or also on Linux as well?
2
PowerShell, Bash, and I'm adding Python into the mix.
1
Powershell. Its my favorite. Started out with bash and python too.
1
Pretty much any of them?
Even docker itself does it.
1
Pretty much the same for me, a DBA guy helped me recover the table. Still appreciate the dude, lol.
1
Pretty simple to be honest, we use a dns management repository for deploying our zones and we have set up terraform to be able to dynamically deploy these zones to various project/env accounts so developers can manage their own subdomains for their projects.
It’s important to have some sort of scanning and checking in place to make sure domains are correctly configured.
It works pretty well to be honest.
1
Primary argument is simplicity. Not everyone needs those extra complex orchestration features, and Nomad, as well as ECS, solve for the primary feature of taking a Dockerfile and getting it up and running in a scheduler as smoothly as possible.
1
private vs public subnets are mostly a matter of attack surface reduction. It's more the philosophy of "Allow what I need" (private) vs "Deny what I don't need" (public)
In a public subnet, you'll have to protect every possible access, and any mistake protecting it (by changing SGs or FW configs) / failing to update your server to protect your resources, may cost you the integrity of your server or your infra.
In a private subnet, by default, nothing is reachable from outside, so you'll have to have something making it possible (a frontend of any sort). It reduces your attack surface and you can more easily protect it.
A correctly protected public subnet is not (imho) less secure than a private network, it's just more work, and you will most likely fail to protect it correctly at some point.
Rule of thumb, don't put something that doesn't need to be reached directly (like databases) in a public network.
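In Terraform terms that rule of thumb looks roughly like this (CIDRs and names are purely illustrative):

    # hypothetical sketch: only the web tier gets public IPs
    resource "aws_subnet" "public" {
      vpc_id                  = aws_vpc.main.id
      cidr_block              = "10.0.1.0/24"
      map_public_ip_on_launch = true
    }

    resource "aws_subnet" "private" {
      vpc_id     = aws_vpc.main.id
      cidr_block = "10.0.2.0/24"   # no public IPs; databases live here,
                                   # reachable only via the frontend tier
    }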
1
Probably it was a typo updating the direct zones in their DNS. The mail should be a reminder, not a request; they may not be aware it is happening. You can also just ignore it: it is some extra traffic, but it should be minimal compared with the amount of exploit finders that are on the net.
About exploiting their mistake: it is something you wouldn't like to happen to you. Besides, you could end up on some obscure blacklist or be taken into account in some dystopian future.
1
Probably little more than a javascript app and an exposed docker daemon :D
1
Probably not the biggest mistake, but the most effect per command: learning that the `killall` command pretty much does what it says on the tin: kill ALL (processes)..
Back then, I only knew the Linux variant which is basically the `kill` command with an integrated process name search :)
I fired the command, I wondered why my terminal froze up, gave it another few seconds of thought and right when I stood up from my chair to run to the server to see what's going on, the department lead called and asked why his server was down and his people couldn't work :D
&#x200B;
And another time, I was at a customer's data center, migrating SAP systems in bulk from their old hardware to our new hardware.. Source and target architecture were pretty much the same, so we copied the operating system and software as a system restore "tape" image and the data by mirroring disks from the old SAN to the new SAN.
Problem was: I had a 4-5 hour train ride there and because of the tight schedule and the time it took to copy all the data, I got up at 4am on Mondays and rode down there for the week and we migrated the first systems at around 10-11am..
And you probably already guessed it: One day, I fumbled on the copying of the data and mirrored the shiny, new, but empty disks onto the old ones with the data :D
Luckily, they had a very good backup system in place and things weren't toooo bad. After that, we worked out a checklist that would attach the disks in a certain order, so that the device names would always be the same and the filesystems would be mounted, so we couldn't simply mirror to the devices, because they were already in use, but we could attach another mirror to the existing device.
1
Problem is that simply reindexing at PB scale takes days, even if you throw ridiculously large sums of money at the problem…
1
PROLOG. Skip the JSON as logic nonsense.
As prolog would say. Yes.
1
Prometheus uses LevelDB, though maybe someone could explain to me why it is a bad idea for this use case?
1
Prometheus/Grafana aren't about logs, they are about real time monitoring of any data set you want (CPU, Memory, disk space, load average etc). They would also make a good example of the earlier Ansible project I gave you - installing prometheus on all of the VMs you create as part of a standard build.
I don't have any good answers for the logs part, I've heard of graylog. You may want to check in r/selfhosted as those guys do quite a bit.
1
Pulumi and here is a detailed article I wrote that explains why: https://www.techwatching.dev/posts/pulumi-vs-terraform
1
Pulumi and it's not even close. Some of the few points someone already posted it here.
https://www.techwatching.dev/posts/pulumi-vs-terraform.
1
Pulumi can easily cross-translate its IaC from one language to another, and it does support a very, very common standard: Typescript.
There’s also YAML and the REST API, which are arguably both pretty standard as well.
1
Pulumi doesn't have an equivalent to Terraform cloud afaik
1
Pulumi for now, I'm meaning to dig into tfcdk.
1
Pulumi is declarative.
1
Pulumi was declarative even before they added YAML.
Just because you used typescript or Python or any other of the languages didn’t make the whole thing imperative, it would still reorder (within the language boundaries) to create resources in the right order
1
Pulumi. I like the power of general purpose language.
1
Put your builds in a S3 or GCS bucket instead, and put a lifecycle policy on the bucket to automatically delete builds after X days.
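The lifecycle policy part looks roughly like this in S3's JSON form (the prefix and day count are examples):

    {
      "Rules": [
        {
          "ID": "expire-old-builds",
          "Filter": { "Prefix": "builds/" },
          "Status": "Enabled",
          "Expiration": { "Days": 30 }
        }
      ]
    }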
1
Python
1
python
bash
1
Python & PowerShell are my go to
1
Python, bash, powershell, and when I need quicker and dirtier than python, I'll throw a PHP script up.
1
Python, bash, Powershell, PowerCLI
Data serialisation languages as well
1
Python, bash, powershell. Yaml.
1
Python, Go, Bash, PowerShell, HCL
1
Python, PowerShell, YAML.
1
raid 5.
Backups weren't tested and we lost some data.
1
Ran a vendor recommended “no impact” command on prod that took a major major system down for 30 minutes costing the company some real cash.
1
Ran terraform destroy with an AWS IAM policy attached (idk what I was thinking) but that nuked every resource in the account and couldn’t bring it back up.
1
Rancher seems to require 4GB RAM machines or larger to be added, I think that's half of my issue (since I am trying to add older hardware to the cluster). The other half is that I don't know which networking drivers to choose for different situations, and other driver info.
1
Read it for a uni course, great for sparking interest and ideals to strive for but it is fiction. I liked it.
1
Read The Mythical Man Month instead. Sure TMM doesn't cover DevOps and cloud and blah blah but it's largely about the same concepts delivered without cringy writing and shitty analogies and metaphors.
1
Realistically speaking, this is a senior-level FTE's worth of work. It's going to require both a lot of policy involvement, and likely a significant budget uplift in subscription fees from you.
It's understandable that you want to see what you might be able to crowdsource and implement on the cheap, but you won't be able to fully implement or maintain "enough assurance to sleep well at night" on your own.
1
Really depends on what’s easy to deploy. I run some eks, not for scaling or all the features k8s has, but because the thing I run comes as a helm chart, and this is the easy option.
I also have some docker swarm mode and some regular docker compose, because those are the easy option for the things they run.
1
Really don't understand why, on a specific question about "programming languages", several people answer with "yaml". For god's sake.
1
Really great advice here. OP is right, Ansible and Docker don’t really go together, but think of Docker as phase 2, you just aren’t ready yet.
BUT, you could use docker (and docker-compose) to quickly spin up a dozen light weight Linux boxes, and use Ansible to configure them all :)
Vagrant would be another option, and I think your best option. Assuming you at least have an ESX server to create infrastructure on, you could work on using Vagrant to spin up the dev environment you want, and Ansible to help configure everything.
That gets you your Dev env, great training, and peace at night as you’re a good deal closer to automated recovery.
Also, I haven’t seen you mention git at all. If you aren’t using git, make sure you start.
1
Recently accepted a position as a T1 support engineer for a very large and prominent private cloud infrastructure company. What technologies would you have someone practice with in order to shift that position into DevOps? I know SaltStack and python are used internally, but outside of that I’m pretty new to the DevOps role/world. Mostly also because I’ve only been in the IT space for 2 years haha
1
Recently did this heh, though am self employed sooooo yeah.
But seriously I've been automating computers for so long I'm amazed how long it took before we thought to automate housework. Roombas are amazing.
1
Refactoring means different code, but the same results. So you write tests that expect a certain behavior and provide as many input/output scenarios as warranted. Then you rewrite the code and run those tests, and get the same answers. If you're able to do it while writing more performant code or cleaning it up, you have a successful refactor.
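A minimal sketch of that workflow in Python (the function and test cases are made up for illustration):

```python
def total_even_original(nums):
    # Original implementation: explicit accumulation loop.
    total = 0
    for n in nums:
        if n % 2 == 0:
            total += n
    return total

def total_even_refactored(nums):
    # Refactor: same behavior, cleaner expression.
    return sum(n for n in nums if n % 2 == 0)

# The behavioral "contract" both versions must satisfy:
cases = [([], 0), ([1, 3], 0), ([2, 4, 5], 6)]
for inp, expected in cases:
    assert total_even_original(inp) == expected
    assert total_even_refactored(inp) == expected
```

Write the cases against the old code first; only then is a green run on the new code evidence of anything.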
1
Remember when the US-East-1 had the big s3 outage due to an engineer screwing up a node and causing a cascading failure a few years back?
Yea I wonder if they still have a job.
Story time for me:
As for me, I was once convinced that by plugging in the admin port/management interface of a SuperMicro Nexenta I had taken down the network.
We had just closed down a datacenter and had our equipment shipped back to our home office, and I, being the internal systems engineer/sysadmin at the time, thought: sweet, they're just going to throw this out; I'll use one for spare parts for my current arrays, and another to replace other, older Nexentas.
I racked the two controllers and the two JBODs, and cabled everything together except for the storage VLAN network connections. I only wanted management so I could wipe and configure it without worrying about anything going wrong. As soon as I plugged the IPMI/management interface into our network/server management VLAN, the entire company went down.
1
Remote replication. And in 2022, it's really not that much data.
1
Removed a target group that had 0 targets. The way CI worked, at one point we checked for that target group's existence, so even though it didn't do anything, removing it stopped the pipeline. Didn't take down prod, but took down CI for an hour.
1
Right - for an enterprise application, visibility and transparency are extremely important. Maybe setting up some alerts in CW with SNS notifications would be a good addition. I can probably scope it so that it doesn't cost me a bundle.
1
Right - you mean setting up the integrations on either end? Like, sending to datadog etc. and integrating with common ticketing platforms?
1
Right on. What’s the use case? Just curious about doing it with all open source stuff?
1
Right, but kubernetes is a *hugely* complex system, and while in theory you don’t care about this stuff, in practice you do to some degree, and you still have to plumb all the services etc together properly. It’s an amazing position to be in when you have a bunch of shit hot kubernetes engineers to manage it because your needs justify that kind of team and requirements. It’s just not worth it for most simpler requirements, and the complexity leads to increased risk of security issues and all sorts of other things.
Depends what you need out of your logs. For self hosted, ELK is pretty solid. I’m seeing more orgs push towards things like datadog and other hosted solutions, though, so less familiar with the self hosted space.
1
Right, thank you! That deployment branch solution makes sense, but I don't think it's something I can realistically implement alongside other features of our process; we have an environment per branch, so splitting each of those in two would result in a lot of difficulty, I think.
Good to know that this is a logistics problem. I was imagining I could just circumvent the issue with some clever use of git.
I feel like that merge queue concept might be what I need. The branch in question can \*only\* be committed to via Pull Request, so if I can just queue up processing of those then I shouldn't get any collisions.
1
Right! I didn't think of this--the fact that the space for the hardware has already been auctioned off, and so those with the largest pocketbooks got the monopoly on these limited spaces...
1
Right. Yeah; it'll probably cost more than that (and need more cores) for influx.
1
rm * as root on /. Surprisingly easy to recover from on Solaris. Just restore all the links to /bin and so on. Just don't log out.
1
Robocopy is a fantastic tool, especially because it will do *exactly* what you tell it to do, even if that something is blowing off all of your appendages.
1
Rogers probably
1
Rolled out a master node change for a petabyte scale Elasticsearch cluster. Rollout went bad and the new master nodes lost all state and then wiped the entire cluster in less than 5 minutes. Our entire team worked 24/7 for 10 days to restore backups and replay data, somehow we got *everything* back.
1
Ruby
1
Ruby, Bash
1
Run your own mail infrastructure, including a web frontend.
Make it work on a cheap 5 bucks a month VPS.
Use that as your primary email for everything
1
Same guy in Chicago told me all roles are remote if wanted.
1
Same here, but I do think there is so much theory potential there. Even the study of static analysis is worth a lot. I wouldn't make it mandatory, but something students can choose and get some good training
1
Saw this kind of cargo cult in every single startup I worked for. Just because something worked for Google doesn't mean it will work for your vegan dating site. And FAANG produces shitty software just like anyone else; I still have nightmares about Apple's in-app payment system.
1
Scale the system so it can handle new customers, this is either by increasing kafka partitions, horizontal/vertical scaling based on what is happening at the time, automating literally any manual process, adding observability into the system and collaborating with developers on the daily.
SREs are the feedback loop in the life of the system you're owning. The jack of all trades who knows how to fix most problems, but also has the connections to the right SMEs of the system to glean knowledge from them and spread that knowledge across the rest of the team.
1
Screenshot on the left and text on the right, booya
1
Seconding the above.
K8s is great fun to use, but if you're currently hand-rolling VMs on bare metal then it's going to be too much complexity, and it will be too hard to see the benefits.
Plus k8s isn't really secure by default, as stated.
You don't need to be a "shit hot kubernetes engineer" to use it, but it's something to consider when you've already got everything in containers and:
* There are 10s or 100s of them and the scheduling of them all is becoming a headache.
* You *need* autoscaling (which you likely don't if you're on prem)
* You *need* your apps to come back up if the host fails (which they very rarely do on their own anyway)
Remember that k8s was created by Google to solve the problem of managing hundreds of globally distributed services with incredibly tight SLOs, and that most envs are actually fine with just moving to docker-compose or vagrant.
There are additional benefits with k8s to do with shifting infra config left so that the devs can get more involved in it but if the devs aren't interested then it's all moot anyway.
1
Security in depth
1
Security+ is more practical and hands-on material.
1
Self-Hosted Gitlab + Kubernetes + Helm + Podman = Life
Ansible + Terraform to spin up and configure Kubernetes Clusters.
1
Sent a PM to ya!
1
Serious question. What interests you about DevOps?
1
Seriously, people give Jenkins a lot of crap, but it's definitely the best documented and most reliable, unless it's seriously way overloaded.
1
Serverless can be done DIY pretty cheap (esp if you go with a budget cloud provider like Digital Ocean or Hetzner). On a Kubernetes cluster, you can install Knative or OpenFaaS. You won't need to rewrite Lambdas in that case, afaik. And you can quickly spin up/down a cluster with Terraform.
As far as speed of Kubernetes vs a VM, k8s starts a container in about the same time that the application inside that container would start on the machine it is running on (whether that machine be a virtual or physical Kubernetes worker node). I'd say with Kubernetes, the advantage lies in the configurability of control and feedback loops. It's an operations team in an API. I've seen people lift and shift some garbo monoliths onto k8s, and \*surprise surprise\* they take a long time to spin up. Developers taking the time to think about how their app works conceptually, and break down processes into smaller scalable pieces...that's where you get the performance benefit out of Kubernetes. Without this decomposition, you may as well orchestrate workloads onto VMs with Hashicorp's Nomad (not a bad choice either).
There's no true scale from zero imo, as you still need to have the patchwork of stuff that connects the dots between people and resources (such as identity and access management), and that stuff has to have live data served from somewhere. But there's certainly a lot of work that can be done to reduce the up-front resources needed to host an application.
The task queueing idea is cool...event-driven architecture. You'd have to run something to queue messages/stream data, like Kafka. Build the mobile app with a client AND server, and have it process other requests in the background from the message queue and persist the data in another system :-p
1
Set up a homelab for this stuff IMO. Even just on a small digitalocean VM or something. There's no better way than to learn from experience, especially since there are so many components involved in a monitoring stack.
1
Short answer is yes. It's that simple. To do it properly, a few more services can be added.
Why is it wrong to ask for guidance? Especially when a working solution is provided with the question?
1
Should be fine.
> Safe for multiple nodes in parallel
https://flywaydb.org/download
1
Shut down all chip manufacturing (actually everything at that site) for a very large, very well known CPU manufacturer for several hours.
The neat thing about networks is that they touch everything so when you take them down everything stops.
1
Sick, thank you!
1
Similar.
The first message I receive is a robotic one saying how somebody was impressed by my profile,
and then: "ah, remote only; ah, no contract".
1
Simpler, so we still use HCL to manage the main zones, and we also use smaller projects to push the specific project zones into the project accounts.
Then the developers just need to look up the zone they need when creating a record.
We wanted the zones to be managed, but without losing the flexibility of our Terraform projects dynamically updating records when resources change, for example ALBs or anything else that uses a DNS name as a reference.
1
Since you asked how to do it right.... well the docker way is to have one application per container. Not several applications in one container. Why do you need that anyway?
https://docs.docker.com/config/containers/multi-service_container/
1
Small company; I was the only IT guy at the time managing the infra. The senior guy left and I was the junior. One of the hard disks of our main VMware server died (most of the company ran on that server); it was in RAID5, so no problem there.
Replaced the drive, and because I "knew" that RAID5 is able to keep working even when it loses 1 disk, I had the great idea to remove a second one after I replaced the failed disk. Don't ask me why.
Of course I destroyed all the server's data, because the new disk didn't have the striping and parity data yet.
Worked like crazy for a week to recover everything, and of course not everything had a backup, so I had to figure out how to configure many Linux services from scratch (LDAP, mail server, DNS, Jabber, etc.).
1
Smart. You'll be alright as long as you don't mistake it for a secondary skill. Just remember how hard life in a broken house would be without a screwdriver. Possible maybe but why do that to yourself.
1
So a standard practice is to use count to skip creating a resource in a given instance. Another standard practice is to build modules. When you start building more complex modules and then calling them, using count can conflict with a for_each, and then you're left making a complex web of code just to say "if this, don't do that, but iterate over this if true". Which isn't a problem in a normal programming language.
1
So I can deploy a runner into an EKS namespace instead, and this will register back to GitLab and can execute my micro-services pipelines?
Primarily they're just deploying node Helm charts but some do other things in AWS like modify S3 files on deploy or read AWS secrets, so my worry was regarding access to other AWS resources as the Fargate tasks use profiles easily but I wasn't sure how EKS would do this.
1
So I gave two examples in my post of the work we are actively doing but I'll reiterate them more clearly so it's understandable.
One issue my organization has is that all work comes from a variety of different sources directly to our engineers. That can be part of our on-call rotation or requests directly to their favorite engineer. According to some data that was recently gathered, there were well over 50 active projects ongoing for a team of less than 20. Those projects also included active business projects.
To help with that issue, we are starting a work intake process through a ticketing solution. That intake will tag cards on the wall with certain pieces of information like the system it's impacting, the team requesting the change, and the type of work it is. If it's something that isn't planned as part of sprint planning it will be denoted as such. The goal is to be able to measure how much of each kind of work we are doing, which systems are impacted most often, and who is requesting work most often so that we can focus our efforts on improving those systems or flat-out replace those systems.
The final, more strategic piece of this is to reorient our teams around being a platform team. The current team I work on does everything for everyone and manages most things ourselves. We are trying to reorient around being more of a platform as a service and building paved roads to use the platform, so we can easily onboard teams onto it and have prebuilt templates that are easy to modify to bring them on. If we do that coupled with the other things I mentioned, it gives us more feedback on the development process and increases velocity, much like the book talked about.
1
So I'm starting a new devops windows automation engineer position on Monday. I was a little nervous.. but now I'm just terrified after reading this thread
1
So I’m a cloud engineer with a sysadmin/systems engineer background. All good on the server, cloud, infrastructure stuff but I’d like to implement more devops practices into our environment but having trouble finding spots do it. We don’t have any devs or custom software (we do but it’s power apps and low code stuff). All our servers are domain controllers, dns servers, sql servers (soon to be azure sql db’s or managed instance) or single application like a big name business intelligence server. So I was going to get into bicep or terraform but it’s like I’m really only deploying the server once. And if I need multiple sql servers I just use the MS image from the market place. My next idea was ansible to manage all the windows server vm’s. We have logicmonitor for monitoring. I know it’s a tough question without knowing our environment and business but any ideas of some basic things I should be looking to implement? Also - no containers or docker or AKS.
1
So nothing at all to do with OP's post; you just dislike a specific technology and invent reasons why we should too? People like you are a dime a dozen in our industry. That's the real tragedy.
1
So sorry for the delay in getting back to you clintkev251. Thank you very much for your advice!
1
So that's why I always have 163 tabs open...
Love this concept and so glad to put a name to it, thank you!
1
So then what will you suggest?
1
So true. Especially if employers want to hire locally instead of WFH. Duct taping things together with Python and Bash, while not mainstream (and the snobs in Silicon Valley would laugh at us for it), is still more than a lot of folks out there can handle, so we still get paid well for it.
I'm at the same place, get paid $120k, and am only just now getting into Docker and K8S, and am still migrating scripts onto Ansible deployments
As long as you don't torch a company's data, tech is the place to be baby
1
So what I was essentially asking for is the docker engine api. Sorry for the poor question.
1
So you are working with users who aren't clear about what the end state of the system needs to be. That isn't unusual, but Agile gives you some ways to work with this reality instead of fighting it.
1
So you would rather install Docker, mount your folder/fs into it, docker pull multiple versions of Terraform, and ensure you run the correct one...
I am at a loss...
Yooo..., if you do this stuff for side home projects to learn Docker and maybe k8s, sure, sounds cool... But if you do it to version developers' environments because you don't know how to do version pinning, or some other maniacal plan... I'll leave it there...
1
So, lately I've been pushing myself to learn my way around AWS and build a static site with a github repo, actions, aws codepipeline and s3, etc. Essentially the cloud resume project but a bit altered, I started watching a series on O'Reilly about AWS, got a few videos through, watched a video here and there, tried reading some tutorials. I got essentially nothing done over the past few weeks.
Yesterday afternoon, I just started building: made a bucket, set up a repo, yadayada, read AWS docs, pieced some stuff together, and I had done it in a few hours without any tutorials. Just by paying attention and doing it piece-by-piece and looking up what I needed to know. It was easy, and if I had just jumped to doing instead of trying to figure out how to do, I would've been done a while ago.
Just wanted to add a quick ramble, I'd probably learn a lot in those 18 hours of video on O'Reilly but it was nice to spend an afternoon and build something.
1
Some colleges are just starting to use Git. It's a slow moving environment, but then that's the point. They focus on fundamentals (for Bachelors at least).
1
Some folks just can't stand that certifications exist. Most are utter crap and unnecessary IMO. However, it's undeniable that they can open doors and some positions (particularly in gov't) require them.
1
Some great ideas here. I'll have to save this and leave it somewhere so the wife sees it.
1
Some keep their documentation in their code base as mark down which makes it easier to enforce documentation when someone makes a change. Another thing is to establish a compliance framework to kind of light the fire. Make your documentation a selling point to customers by using it to get a certification like ISO 27001 or SOC compliance. That will also force your business to keep documentation current.
1
Some of us are ballz deep into the cloud.
1
Some people and some situations take more time to embrace progress.
1
Some people like to live life on hard mode, spend time and money to make something harder to maintain and likely quite a bit less secure.
Google “AWS shared responsibility model” you want to have as much stuff below the line as possible. Focus on solving the businesses goals - which aren’t on building redis and then patching it.
1
Some say you should not mix the infrastructure with the application layer. How much truth is in this you realize after the first few major production meltdowns.
1
Some services need neither outbound nor inbound internet access. For these services, toss them in a private subnet, restricting access in either direction.
And by putting your instances on a private subnet, you prevent inbound public internet at the IP layer. That is, packets cannot arrive at your instances, adding an additional layer of protection against a misconfigured (or lack of) security groups, firewall etc
1
Somehow your post was removed. Do you think you could send me a link?
1
Something else: you mentioned alerting is done by others? Why not set up some monitoring yourself? There are some cool and sexy choices out there, all free.
Zabbix is an enterprise level monitoring and alerting platform, AND they have a docker container to make setting things up easy.
Prometheus and Grafana are also incredibly popular for good reason. You could use these to give yourself a dashboard of all the important things.
1
Something like [portainer](https://www.portainer.io) ?
1
Sometimes bluntness conveys the strongest message
1
Somewhat a tangent....
You mentioned terraform, my understanding is that it needs a state to actually be operational. Are the state files also stored in git repos?
I've noticed that gitlab can store terraform state.
Testing it out in a lab has been on my to-do list for ages, but I don't have much experience with Terraform apart from basic concepts.
1
Sorry for late reply
Well, I have my own servers, so I get what you are getting at. I suppose my goal is to be able to selfhost services in docker on the MacBook when I don't have any way to reach my own servers (Lightning has been a problem of late)
1
Sorry for the vagueness, I was referring to this part:
> "I initially created the CRC a couple of years ago because I was seeing a huge wave of people trying to get "six-figure cloud jobs" off the back of little more than a couple of certifications. This is not sufficient and it contributes to the industry's bad attitude toward hiring and mentoring juniors."
1
Sorry I should have explained. The bot is running on a remote server.
I’m using SFTP to clone changes to it when I push to master, then I wanted to restart the script so it picks up these changes…
I’m using Discord.py
2
Sorry, I'm still a proponent of university being focused on Computer Science, not technical career training. If you get a CompSci degree that includes network fundamentals, set and graph theory, and operating systems, then DevOps proficiency at a junior level is no more than a couple of basic training courses away.
1
Sorry, one last thing. Start documenting this journey and setup a blog or something :)
I’d love to hear how it’s going, and I’m sure you aren’t the only one in this boat.
1
Sorry. All I have to offer is the documentation, which is quite good
1
Sounds like a good way to accidentally run up a bill you can’t afford.
1
Sounds like it should be baked into the image.
1
Sounds like someone overheard Infrastructure as Code but couldn't remember it correctly when telling about it... Infrastructure as something... something... Must have been data , I guess 🤪
1
Sounds like you have all of the skills and it's more of an organizational/political problem. You should try to make a relationship with someone on the team you want to join, or the team's manager, and express interest in some of the work they are doing. Not necessarily asking for a job or anything, more of an informational interview. People are usually happy to share the work that they've done, it makes them feel good that someone else cares. It might take time to work that relationship into a real position and it could end up being an impossible task, just based on the company's structure and politics. But you might end up lucky where the team has a position to fill, and they already know you're interested, and you can get that internal transfer...
1
Sounds like you have your bases covered.
"Fun encrypted disks" made me chuckle.
1
Spam
1
SPF records are added to the sending domain's DNS zone. So if you send an email from `example.com`, you want to add it to the `example.com` DNS zone file:
`example.com. IN TXT "v=spf1 .... -all"`
In the .... you can define which IPs are allowed to send mail for the domain (mostly your mail servers). If you're using [google](https://support.google.com/a/answer/10683907?hl=en&ref_topic=10685331)/[microsoft](https://docs.microsoft.com/en-us/microsoft-365/security/office-365-security/set-up-spf-in-office-365-to-help-prevent-spoofing?view=o365-worldwide), there is a guide for adding SPF to your domain.
1
Spun up a Cassandra cluster with the intention of running restore tests. Usually NSGs prevent clusters from talking to each other. Turns out this specific vnet didn't have them. As soon as the restore completed, the new Cassandra cluster took over one of the live production clusters. Users were confused as their last 24 hours' worth of data was missing.
1
SpunkyDred is a terrible bot instigating arguments all over Reddit whenever someone uses the phrase apples-to-oranges. I'm letting you know so that you can feel free to ignore the quip rather than feel provoked by a bot that isn't smart enough to argue back.
---
^^SpunkyDred ^^and ^^I ^^are ^^both ^^bots. ^^I ^^am ^^trying ^^to ^^get ^^them ^^banned ^^by ^^pointing ^^out ^^their ^^antagonizing ^^behavior ^^and ^^poor ^^bottiquette.
1
SQL update with no where clause.
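For anyone who hasn't felt this pain yet, a small sqlite3 demo of what the missing WHERE does (table and data are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT, active INTEGER)")
conn.executemany("INSERT INTO users VALUES (?, ?, ?)",
                 [(1, "alice", 1), (2, "bob", 1), (3, "carol", 0)])

# Intended: deactivate only bob. Forgetting the WHERE clause...
conn.execute("UPDATE users SET active = 0")  # ...hits every row

deactivated = conn.execute(
    "SELECT COUNT(*) FROM users WHERE active = 0").fetchone()[0]
print(deactivated)  # 3 -- all rows, not 1
```

Running the statement inside a transaction (and checking the affected row count before COMMIT) is the cheap insurance here.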
1
SRE is just a fancy name for OPS who can code.
1
SREs should be focused on uptime/HA, performance, alerting, monitoring, and logging. So its very typical to collect metrics/logs and ship to products like datadog/new relic/sumologic and create alerts based on the health of the infrastructure. Then creating automation to act on these alerts or adjust the infrastructure to prevent alerts from recurring.
1
ssh_executable should work if you set it. You can also set ssh_common_args if necessary.
[https://docs.ansible.com/ansible/latest/collections/ansible/builtin/ssh_connection.html](https://docs.ansible.com/ansible/latest/collections/ansible/builtin/ssh_connection.html)
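For reference, both options can live in `ansible.cfg` under the `[ssh_connection]` section (the path and proxy host below are illustrative):

```ini
[ssh_connection]
ssh_executable = /usr/local/bin/my-ssh
ssh_common_args = -o ProxyJump=bastion.example.com
```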
1
Stand up your own gitlab instance and migrate your CICD jobs to it.
It's utterly trivial and you won't have to retool to something like Jenkins.
1
Start by working as a backend developer for a year or two. Learn how a backend system works and why it functions the way it does. When you fully understand this, you can dive into the ops area (sysadmin/devops/SRE) and improve how said backend system runs in the infrastructure.
You need some real experience in the industry before you take on a devops role. The certs don't matter much compared to real hands-on experience. As an example, I can easily learn how to swing a hammer. But I sure as hell don't know how to build a proper house.
1
Start with feedback systems, good feedback will help you prioritize other things
1
Start with monitoring; pick a tool: Grafana, Nagios, whatever...
Now that requires agents installed in every VM, write an ansible playbook to do that for you.
Then decide on a different project, is there a common language, tool or package installed in a lot of your VMs? Automate that with ansible.
You can also take a look at things like The Foreman to kickstart your VMs for you.
Next time you have to do a repetitive task try to automate, write a script to do the task for you, if it's a 30 minute task you do everyday, spend 3 hours automating and by the second week you are already ahead of the curve
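A minimal sketch of such a playbook (the `zabbix-agent` package name is just an example; swap in whatever your monitoring tool uses):

```yaml
# install-agent.yml: install and enable a monitoring agent on every VM
- name: Install monitoring agent
  hosts: all
  become: true
  tasks:
    - name: Install agent package
      ansible.builtin.package:
        name: zabbix-agent
        state: present

    - name: Ensure agent is running and enabled at boot
      ansible.builtin.service:
        name: zabbix-agent
        state: started
        enabled: true
```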
1
Start with understanding how you would deploy the application manually. Then think about how you would re deploy it after a change. Think about how to make that smooth and easy.
Understand how you might support that application- how do you get to log files, etc. How do you troubleshoot problems reported by an end user, or by anyone else that is using or supporting the application.
Understand how database schema changes are managed, and how that can be made smooth and easy. Likewise for changes to database lookup tables and the like.
Think about how you scale the application if that is something that might be needed.
Those are the fundamental things in my mind.
1
States contain metadata and sensitive info regarding your infra. It should always be stored using a remote backend.
There's 0 reason why you should use a local state, committing the state file to your repo.
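For instance, an S3 remote backend with locking looks roughly like this (bucket, key, region, and table names are placeholders):

```hcl
terraform {
  backend "s3" {
    bucket         = "my-terraform-states"
    key            = "prod/network.tfstate"
    region         = "eu-west-1"
    dynamodb_table = "terraform-locks" # prevents concurrent applies
    encrypt        = true
  }
}
```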
1
Stay
1
Stay.
1
Stick to your own company. That is what I would do. They saw you have options, bumped it, and you already know your prospects. Full remote is something I would never exchange. They'd better pay me double what I make, with a guaranteed minimum 3-year contract with buy-out, before they can make me visit the office more than 2 days a month.
Many ITT already argue that you should stay from different perspective. What I would say is stick to the one that you know the best and gives you the most butterflies in your stomach. Do not underestimate the power of your bowels.
1
Still useful to know for context, thanks.
1
Stopped our core application on about 250 servers, causing a flood of false alarms to all our customers for about 30 minutes.
Don't freehand ansible ad-hoc commands, kids. Or if you do, always set the `--limit` flag before you start writing the actual module/argument commands.
1
Stories. Beer on me in Finland.
1
Storing time series data in ES is still way more costly and slower than a dedicated TSDB engine. You can't even call the way Elasticsearch stores documents a "series".
1
Story about an accidental `rm -rf /`... in production, which was supposed to be `rm -rf ./`. I stopped it after half a second, but it wiped out /etc and /bin and stuff. I had nothing. But I had Python. I used Python to navigate the filesystem, wget'd binaries from CentOS, same version, and used the same Python to chmod them and make them executable. That was enough to get the OS "usable" enough to work with the shell, get the data off asap, and begin setting up a new system. But at first I felt paralysed; it was scary.
I'll be doing what you suggested. I have a script which sets up Ubuntu 20.04 the way I want and ready for use as a deployed VM. I will make a Playbook to set that up instead. Thanks!
1
Strongly agree with using job postings to identify skill gaps. I'd also recommend trying to do personal development during work hours instead of after work, especially if any of what you are learning would be beneficial at your current employer.
1
Stupidest thing was a name change for a DB in Terraform, which hit a bug (opened 5 years ago) that caused the DB to be destroyed, along with any backups or snapshots. So we literally *lost* all the data.
Biggest was when Terraform decided to delete an S3 bucket policy for NO reason, which unplugged it from everything and caused the entire website to go down.
You may see a theme.
Since most others used a lot more sysadmin-esque examples, the biggest one in that category must've been when I was doing a database migration and accidentally unplugged the Ethernet cable. I couldn't get the session back, so I had no idea what progress it was making or if it was even still doing something. I had to periodically check the data, which made for a frightening 5 minutes.
1
Supplementary question: how to remove old images from the docker container registry?
1
Sure, but that can still be managed by a simple bash or powershell or batch script that checks if the AWS CLI exists and, if not, installs it. And then just run your steps.
I am sure your computer-illiterate friend can run the script with sudo or elevated permissions.
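A hedged bash sketch of that check-then-install wrapper. The install path mirrors the documented Linux x86_64 AWS CLI v2 bundle; `your-steps.sh` is a placeholder for whatever the script is meant to run afterwards:

```shell
#!/usr/bin/env bash
# Check whether a CLI exists; install it only if missing, then run your steps.
set -u

install_aws_cli() {
  # Documented AWS CLI v2 install flow for Linux x86_64 (adjust per platform).
  curl -fsSL "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o /tmp/awscliv2.zip \
    && unzip -q /tmp/awscliv2.zip -d /tmp \
    && sudo /tmp/aws/install    # needs elevated permissions, as noted above
}

ensure_cmd() {
  # ensure_cmd <command> <installer-function>
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1 already present"
  else
    echo "$1 not found; installing..."
    "$2"
  fi
}

ensure_cmd aws install_aws_cli || echo "install failed (needs network + sudo)"
# ./your-steps.sh
```

The `ensure_cmd` helper is generic on purpose: the same pattern works for any other tool the steps depend on.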
1
Sure... you can believe what you want. Hey, the Earth was flat once, right?
Now that we've established you are more knowledgeable, can you please explain your strategy to keep up with all the DevOps areas below?
* Continuous Planning
* Prioritization
* various capabilities within this
* Tracking
* Issue and work tracking
* Continuous Integration
* Version Control
* Development Practices
* Security
* System Architecture
* Continuous Testing
* Test Automation
* Security Testing
* Functional Testing
* Non Functional Testing
* Continuous Integration of automated tests
* Test Environment and Data Management
* Various metrics, defects, etc management
* Continuous Monitoring and Feedback
* Telemetry
* Different types of monitoring and its meaningful usage
* Incident Response Management
* Continuous Deployment and Release Management
* Self Service Portals
* Release Strategies
* Cloud Infrastructure provisioning, Application releases, etc
* Database releases
* Configuration Management
* Automated Deployments, etc
1
Surely we’re heading towards lambda in lambda. Lambda squared.
1
Switching library would have produced a different data signature, which would have required re-processing historic documents so we could re-train downstream models.
Agreed, not lovely. Also agreed that ideally we would have fixed the FD leak.
However, the team was experienced in data science (python), aws cloud, and js/nodejs for ui/api. So diving into C programming to understand the image library was beyond our skill set.
1
Sysadmin roles still very much exist and will continue to exist for legacy stuff. But it's an obsolete career path. You can continue doing manual ops, but you will not be seen as a "senior sysadmin" but rather as a forever-junior devops engineer
1
Sysadmins can do plenty of programming, internal utilities, open source stuff, scripting, grabbing dev tickets if nothing's on fire, etc. Pssshh *glorified sysadmins* you whippersnappers
1
System administration, project management, account manager.
1
systemd-resolved is disabled right after the server install. :(
1
Tablespace issue?
1
Take a look at how I used CloudFormation and Terraform to create my infra on AWS, used Ansible to configure the OS and install Kubernetes via kubeadm, followed by Helm charts to deploy my apps in Kubernetes.
I have made a series on my blog talking about how I built my blog using these tools. I think you might find it easier by seeing examples. My code is also available on GitHub so you can actually see what it's doing
https://lexdsolutions.com/category/how-i-built-this-blog/
1
Take my helpful award. :-)
1
Take the approach of replacing something that's pretty cookie-cutter in your org with a straightforward ansible playbook. Then you can compare and contrast it against the working one to see what's missing. That will give you a taste of ansible. Then on the next server you do, extend your knowledge by taking the things that will need to be everywhere and separating them into one playbook, with your app config in a separate playbook. For your third one, see if you can leverage the roles concept to further your base infra playbook. By the fourth server you'll get it, and have a plan for the rest of the stuff you can chip away at.
I really recommend the 90% approach. What that means is: have the servers do 90% of the building automatically, and that last 10% that's hard and unintuitive can just be documented. Why that's a good approach is you'll convert a ton of infra quickly while learning the tool. Then you'll be in a better place and the last 10% that needs doing won't be such a burden.
1
Tbh I want to work remotely from my shithole country for some US company or so.
I think I've gathered enough experience in the backyard to do something more meaningful with my life.
1
TBH it's really hard to work out if someone is talking about their first-hand knowledge or just what their team has done, rather than what they have personally achieved. The difference is night and day, but hard to discern in an interview
1
Tbh, your post title is click-bait - at least for me. It's the type that'll surely kill any interest I have in reading the article. But yeah, that's just me. You do you. :)
1
Tekton for CI & ArgoCD for CD
1
Terraform, because I can read the config files even after the people who wrote them have left the company, instead of them being written in some random language I might not know in Pulumi... but I'm getting more and more partial to Crossplane. It's declarative like Terraform, but it also actively enforces the configuration constantly, which Terraform and Pulumi do not.
1
Terraform because it is declarative and pulumi is full of bugs, lacks support on many services or features, and the paid support is terrible.
Crossplane looks promising
1
Terraform by a mile. Pulumi is very powerful but creates maximum footguns. Many of the downfalls of Terraform can be relatively easily solved imo:
* Terragrunt to manage multiple environments seamlessly.
* Env0 to shore up the benefits of using Crossplane without introducing operational lifecycle issues. Its drift detection takes the correct approach of "notify, don't rectify", which is better than how Crossplane manages infrastructure imo. Sometimes when changes are made, the path backwards can be more harmful than the drift itself.
1
Terraform cloud is a step back. The moment you use it, you can't use prefixes in your state configuration and are forced to have one workspace per state as part of your design. That is a pretty huge constraint.
Terraform free is much better in this area. All you need is a bucket for the state and you're free to store multiple states per workspace.
Sentinel is OK, but there are alternatives (some free).
1
Terraform in an enterprise setting, due to community (and Hashicorp) support. Pulumi for personal projects because it's awesome to be able to use Python (or your preferred language) in an IaC specific way.
1
Terraform is great for sysadmins who want to keep their jobs.
I pushed for Pulumi so that devs could, you know, develop code in a common programming language and not have to learn how to make a config file in a single-use "language".
Of course, since the "cloud" team was all infrastructure ops guys you can guess which way they went.
1
Terraform is great until you need an if statement that a junior engineer can understand. In order to do an if statement in HCL you need to do a ternary with count. At the time I switched to pulumi the terraform CDK was non-existent, and I haven't played with it, but pulumi has served me well since so I haven't needed to.
With pulumi I think the most confusing thing is the Output type in typescript and how it isn't a native promise, thus other libraries don't work well with it. I understand why they didn't use native promises, but it's still confusing and adds some weird issues to debug sometimes.
1
Terraform is one. Even though most ppl say it's IaC, think about it: one of the most practical reasons for using terraform is the state file. That is data.
1
Terraform supports a whole load of storage backends.
Terraform cloud works well, and although it's a paid service it's not terribly expensive; and there is a free tier.
1
Terraform x Cloudflare
1
Terraform, Python, Groovy, Bash.
1
Terraform. Because of the job market.
1
Terraform. I’ll take declarative any day.
1
Terraform. Pulumi doesn't add anything new, allows many additional ways to shoot yourself in the foot, and has less community support. If I have Kubernetes cluster in hand, then Crossplane all the way.
1
Terragrunt -- do you use custom TF modules?
1
Terragrunt module for route53, which is later used as a dependency for other resources
1
TF when you just want straight, simple, declarative infrastructure. Most people have this use case. The simplicity, maturity, low learning curve and support win out.
Pulumi if you need highly programmatic/dynamic infrastructure that is difficult/impossible to do with TF. There are cons that come with this increased flexibility. Way more footguns, a code dependency tree that now needs to be security scanned as there are now more avenues for a supply chain attack, lots more room to over-engineer things.
In the long term for 2nd use case, I think CDKTF, while being younger, will probably win out over Pulumi due to Terraform's existing large user base.
1
Thanks for your overview! The last time I built an application was a really long time ago. I did actually deploy an application recently on Heroku (I used it lots of times when I was programming in RoR) and I use Docker as a hobbyist. Being the good geek I am, I'm still doing some software-related stuff, especially in the home automation field. But I haven't touched an IDE in a long time. I just want some kind of validation for hiring companies that I really have DevOps knowledge.
1
Thank you :)
1
Thank you :) Managed to muddle my way through and fix the issues I hit yesterday. I've definitely learnt a lot in the past few weeks, but the sheer amount of stuff to learn can feel overwhelming at times.
1
Thank you everyone for your answers. I have tweaked the HPA a bit with the help of one of you on Github, and I've removed the CPU limits from my container (cc /u/ARRgentum). The container does not crash anymore under 100 VU, but I've been able to crash it with 1000 VU. I think I still need to tweak the autoscaler and to always be running multiple containers before running a spike test.
Here are the results of k6 for 10, 100 and 1000 VU:
running (1m01.0s), 00/10 VUs, 395 complete and 0 interrupted iterations
default ✓ [======================================] 10 VUs 1m0s
data_received..................: 2.4 MB 40 kB/s
data_sent......................: 315 kB 5.2 kB/s
http_req_blocked...............: avg=534.09µs min=0s med=0s max=76.45ms p(90)=0s p(95)=0s
http_req_connecting............: avg=147.28µs min=0s med=0s max=14.67ms p(90)=0s p(95)=0s
http_req_duration..............: avg=74.52ms min=28.11ms med=48.9ms max=779.21ms p(90)=172.02ms p(95)=224.66ms
{ expected_response:true }...: avg=78.61ms min=28.11ms med=49.89ms max=779.21ms p(90)=183.41ms p(95)=236.06ms
http_req_failed................: 14.28% ✓ 395 ✗ 2370
http_req_receiving.............: avg=284.15µs min=0s med=0s max=21.12ms p(90)=908.26µs p(95)=1ms
http_req_sending...............: avg=22.07µs min=0s med=0s max=1.02ms p(90)=0s p(95)=79.3µs
http_req_tls_handshaking.......: avg=356.1µs min=0s med=0s max=53.92ms p(90)=0s p(95)=0s
http_req_waiting...............: avg=74.21ms min=28.11ms med=48.53ms max=779.21ms p(90)=171.76ms p(95)=224.41ms
http_reqs......................: 2765 45.346392/s
iteration_duration.............: avg=1.53s min=1.34s med=1.5s max=2.2s p(90)=1.67s p(95)=1.77s
iterations.....................: 395 6.478056/s
vus............................: 10 min=10 max=10
vus_max........................: 10 min=10 max=10
running (1m03.1s), 000/100 VUs, 500 complete and 0 interrupted iterations
default ✓ [======================================] 100 VUs 1m0s
data_received..................: 3.3 MB 53 kB/s
data_sent......................: 429 kB 6.8 kB/s
http_req_blocked...............: avg=16.98ms min=0s med=0s max=834.82ms p(90)=0s p(95)=0s
http_req_connecting............: avg=553.56µs min=0s med=0s max=28.2ms p(90)=0s p(95)=0s
http_req_duration..............: avg=1.62s min=64.31ms med=920.6ms max=6.18s p(90)=3.68s p(95)=4.4s
{ expected_response:true }...: avg=1.67s min=335.62ms med=916.14ms max=6.18s p(90)=3.84s p(95)=4.52s
http_req_failed................: 14.28% ✓ 500 ✗ 3000
http_req_receiving.............: avg=251.16µs min=0s med=0s max=11.76ms p(90)=886.71µs p(95)=1ms
http_req_sending...............: avg=21.39µs min=0s med=0s max=1.19ms p(90)=26.11µs p(95)=81µs
http_req_tls_handshaking.......: avg=4.51ms min=0s med=0s max=389.6ms p(90)=0s p(95)=0s
http_req_waiting...............: avg=1.62s min=63.4ms med=920.5ms max=6.18s p(90)=3.68s p(95)=4.4s
http_reqs......................: 3500 55.462152/s
iteration_duration.............: avg=12.48s min=10.46s med=12.56s max=13.47s p(90)=13.11s p(95)=13.24s
iterations.....................: 500 7.923165/s
vus............................: 13 min=13 max=100
vus_max........................: 100 min=100 max=100
running (1m30.0s), 0000/1000 VUs, 1945 complete and 499 interrupted iterations
default ✓ [======================================] 1000 VUs 1m0s
data_received..................: 13 MB 148 kB/s
data_sent......................: 1.5 MB 17 kB/s
http_req_blocked...............: avg=268.46ms min=0s med=0s max=4.8s p(90)=894.6ms p(95)=2.2s
http_req_connecting............: avg=52.38ms min=0s med=0s max=1.11s p(90)=69.96ms p(95)=92.82ms
http_req_duration..............: avg=9.24s min=14.14ms med=5.26s max=38.73s p(90)=28.97s p(95)=33.52s
{ expected_response:true }...: avg=11.18s min=31.83ms med=8.2s max=38.73s p(90)=31.54s p(95)=34.56s
http_req_failed................: 48.50% ✓ 4001 ✗ 4248
http_req_receiving.............: avg=431.14µs min=0s med=0s max=37.95ms p(90)=889.98µs p(95)=1ms
http_req_sending...............: avg=31.73µs min=0s med=0s max=1.99ms p(90)=36.51µs p(95)=104.95µs
http_req_tls_handshaking.......: avg=195.35ms min=0s med=0s max=3.56s p(90)=645.56ms p(95)=1.79s
http_req_waiting...............: avg=9.24s min=13.63ms med=5.26s max=38.72s p(90)=28.97s p(95)=33.51s
http_reqs......................: 8249 91.645655/s
iteration_duration.............: avg=23.87s min=2.27s med=8.45s max=1m25s p(90)=1m11s p(95)=1m15s
iterations.....................: 1947 21.630997/s
vus............................: 504 min=504 max=1000
vus_max........................: 1000 min=1000 max=1000
These numbers are more appropriate to me, but I think I can still do better. I will experiment and document this in my repository.
Thank you all,
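For anyone following along, the "always be running multiple containers" idea boils down to raising the HPA's replica floor. A minimal sketch (the deployment name and the numbers are hypothetical, and the guard keeps it a no-op without kubectl):

```shell
# Keep a minimum number of replicas warm so a traffic spike lands on pods
# that already exist, instead of waiting for scale-up.
command -v kubectl >/dev/null 2>&1 || { echo "kubectl not installed; skipping sketch"; exit 0; }

# min 3 replicas always running, cap at 20, target 70% average CPU
kubectl autoscale deployment my-api --min=3 --max=20 --cpu-percent=70

# inspect the HPA while re-running the k6 spike test
kubectl get hpa my-api
```

The trade-off is cost: those warm replicas are paid for even when idle, which is why spike-heavy workloads usually tune `--min` rather than relying on scale-up speed alone.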
1
Thank you for encouraging me! Yes definitely "selling" what you do/your contribution outside is the most important factor always hehe!
1
Thank you for your help!
1
Thank you for your answer, I will check Drone out. Sounds like a good fit. I do not want to battle a UI and would rather work in YAML or JSON files.
1
Thank you for your answer, the culprit was probably too-aggressive CPU limits. I've removed them and posted the new results of the load testing in a comment.
I still have issues, but I think there's more tweaking to be done on the HPA to get acceptable results. I don't want to alter the test by having the load increase more gradually: I'm trying to learn how to take on a "Reddit Hug of Death" moment. Are there any practices I should know of for handling spike traffic?
1
Thank you for your answer.
Do think there are some must-and-should concepts that I have to learn to get recruited into a DevOps role?
It's very confusing and uncertain for me, I'm not sure what to learn first.
1
Thank you for your support and the reminder. It might be a bit of imposter syndrome, but it's also more about planning for the future, let's say. I am trying to be honest with myself without letting the fear from the imposter syndrome take me over.
I know that I don't enjoy algorithms, development, or an analytical way of thinking, and that I like configuring stuff or playing with servers, but that's it.
So I am wondering what the proper steps are to take advantage of the company's support and the education they are willing to offer me.
Asking them and telling them that I want to go for development won't be beneficial for me or for them.
But there is also the risk that it turns them off in the end if I am too open about how much I suck at development and how much I don't like it... If all or the majority of their projects are mostly development, it's a problem... so I have to use some diplomacy and be careful how I handle it.
1
Thank you so much for this post! It's so kind of you to support me and advise me like this; it's really appreciated. Your advice also makes a lot of sense, and it's true that even though I don't enjoy programming I should still practice a bit (at least very basic automation stuff), while at the same time taking advantage of everything the company has to offer in terms of education and training, even in other tools and not only programming.
Yes, I know that if I don't like programming I am kind of pushing myself out of DevOps, but I just hope I can still cling to this company in a different role, or at least with different tasks, even if on the label I'll be DevOps
1
Thank you so much for your input! And I agree that depending on the company or the project, DevOps doesn't have one specific right way it has to be done; that's why I am trying to find a solution for me (someone who doesn't like coding or algorithms much).
Another big obstacle for me is my night-owl habits... For now I am lucky: I negotiated hard with my team and pushed the (sometimes annoying, imo) daily stand-up meeting later in the day, but I wonder if in the long term I can establish that as a de facto victory within the company and get more support from upper management about my night-owl nature.
So far I don't have a problem, but if I want, or they suggest, to change project/team at some point, I might have a problem, since I'll have to bring this issue to the table again (that I can't handle waking up early EVERY DAY).
1
thank you very much sir/ma'am. I appreciate the effort, expertise and your spirit.
1
Thank you very much, I will take a look at this. I need to get my hands on some equipment to start experimenting with these things.
1
Thank you very much!
1
Thank you very much. I think what I will do first is to set up some test labs and practice with Ansible, writing some Playbooks to deploy some stuff, and get to the point where, as someone else said, I should not be stressed by destroying a machine because I can just create it again with 1 command.
Then I will look at containers and go from there. Thanks for the in-depth comment!
1
Thank you, you just reminded me I need to run a terraform destroy on my Azure test environment... Nothing too big, but around $300 a month: WAFv2 and a container running Atlantis and Infracost.
1
Thank you, I will check these out and have a read!
1
Thank you, I will set up a reverse proxy. Do you have any recommendations on how I should go about doing it? A lot of the guides tell me to use nginx but I feel it may be a bit overkill for my application.
1
Thank you, it worked 👍
1
Thank you, that's really useful. This does give me confidence that Heroku is the right option for my requirements, and I'll definitely have a look into the CDN suggestion.
1
Thank you, this is good advice and a good starting point. I like the term "think of servers as cattle, not pets". It shows to me that I think of my servers as pets right now. Actually they are almost like family to me at this point, haha.
> Having the box destroyed should not stress you out
This should be the first goal, I think. Because just being in this position is peace of mind. If everything is disposable and easily redeployed, with data backed up. That's already a long way ahead of the current.
Thank you!
1
Thank you, wholly agree. As I've been chatting with businesses it's become clear that quite a few are just chasing VC speculation, so developing a new hammer that there are no nails to hit. So they want you to build something without knowing what the use-case is but with certain buzz-words checked off.
Others have a more clear idea of the problem they're trying to solve in the sector, and those guys are the ones that you start to realize that the underlying tooling will be similar, but need to think in a little bit differently. Your MVP doesn't exist in a single-stack but in a very diverse set of services (including in-house as you mentioned).
1
thank you, will do!
1
Thank you! But in a practical way, how can I work on them? For example, should I tell my manager that I want to focus on Prometheus server administration, and maybe also Ansible if it's in some projects, etc.?
He told me that he'll look around at what projects are going on and what the future plans are for my team, so I can find stuff that I like and want to get better at, but since I kind of despise development it'll be hard to find something...
Also, I have to be honest: I like operations and sysadmin work since, once something is up and running, you can chill a lot, or I might configure something or push to git and then rest for a few hours till I get feedback.
1
Thank you! I will look into this
1
Thank you. I should check with them whether I can recover today's hours, but I am not sure about it.
Not having a doctor's note is putting me in a bad situation.
1
Thanks
I haven't created the cert yet.
I'm thinking about the correct way to do it:
1. Create the cert on the host, or
2. Create the cert in the container?
2.1 by customizing the Docker image, or
2.2 via the docker-compose configuration?
1
thanks a bunch
1
Thanks a lot.
1
thanks bro, I will do my best.
best wishes for you too.
1
Thanks bro, much appreciated, less for AWS, more on CI/CD and containers
1
Thanks for shedding light on this! The issue of people working in DevOps not being able to code has many downsides. Configurations have become as complex as code....
1
Thanks for highlighting Timescale u/lungdart
For u/zuxqoj TimescaleDB is packaged as an extension to PostgreSQL, and if you manage your own software stack (on-prem or in the public cloud) then the software is free to use and modify. If you need cloud-based managed services, then first stop might be to talk with Timescale.
Happy to go DM (*I do, of course, work for Timescale*) to see if we can help you out here. For a quick intro, then the YouTube channel is a great starting point: youtube.com/timescaledb
1
Thanks for sharing your experience. It might be worth asking for some consultancy to get things done right from the beginning. As you said, experience counts, and I don't have any aside from some home-lab projects for my own stuff.
I highly doubt the company will go with such an approach, but I'm definitely going to propose the idea!
1
Thanks for the advice!
1
Thanks for the advice. For work I’ve already built apps using EC2, though the actual code was already existing.
That’s the other interesting aspect about serverless that I’m hoping to see in more depth. I feel like when you compare an app running on a VM to one architected with serverless in mind, the serverless app will have a lot less code. Do you think that’s usually true?
1
Thanks for the clarification; it looks like I misunderstood.
Silent/hidden errors are what worry me. If those do not happen more by default, then it's okay.
1
Thanks for the confirmation. Can you give me insights into setting up the home lab?
1
Thanks for the in-depth message. The takeaway I get would be: learn Ansible and use it to get monitoring going asap. Yea, I agree with this.
I'm going to set up a lab with a few VMs for learning and playing with ansible, and I will actually make Playbooks to deploy and create those VMs, do the initial setup, etc. Then I'll move towards monitoring and getting Playbooks for the agents, and go from there.
Thanks!
1
Thanks for the info. The "where to draw the line" thought was more around when/where to use Dagger vs vanilla CUE.
It does sound like there's enough of an ecosystem in place around Dagger to be able to leverage that? Or would it still make sense to use vanilla CUE in most cases?
1
Thanks for the link though
1
Thanks for the response! Yeah I will definitely be hands on. It’s all OpenStack so there’s tons to take in and I’m insanely excited.
1
Thanks for the suggestion! I'm definitely wanting to make the `README` a little clearer, so I'll probably implement this 😊
1
Thanks for your deep insight on the topic! very much appreciated
1
Thanks for your reply. I guess the way I was looking at it is that now my app is ready for launch, and while I have the 'luxury' of having no users that this seemed like a good time to get everything on the most ideal platform in terms of allowing it to grow. I've been at enough companies over the years undergoing platform migrations that the cost/time/hassle aspect makes me want to avoid it. But I completely take your point, I should be focusing my time on The Thing and deal with that problem (not a bad problem to have) should it arise if things do succeed.
I do like tinkering and learning stuff, which is likely the problem here. My overall takeaway from the replies here is to leave as is and just get launched so that has been really helpful.
1
Thanks I will look into it
1
Thanks I've got a few courses that I've been trying to get through. Do you find the udemy courses good?
I've just been using pluralsight
1
Thanks mate, yes this makes sense.
1
Thanks. The manager is new as well; he started last week. The person who interviewed me left just before I joined. I was very open in the interview that I had little experience with kubernetes and would need some time to get up to speed.
Will try talking to the new guy to see if I can block out some time with someone to go over the areas I'm struggling with.
1
Thanks u/Dry-Republic-9554 we are considering Timescale DB but they don't offer support for on-prem deployment and we don't have any experience of working with Timescale DB so far, for this reason, we are a bit hesitant about using it.
1
Thanks u/LoriPock, we have to deploy our product on-prem only, this is customer requirement.
1
Thanks u/nroar, any idea if they offer on-prem support?
1
Thanks very much for pointing this out! I really appreciate the link, and my apologies for replying slowly.
I was aware of the existence of the CLI but I didn’t know this was one of its functions.
1
Thanks very much, I'm definitely going to take a look at that repo, and the steps you listed sound like the perfect way to get started with Ansible in practice.
It is a scary thought to tell Ansible to do stuff and not care how. I have to get over it and trust it, and that can only be done by writing Playbooks. I'm sure I don't even need custom modules, but could make some eventually. I will keep it in mind then to avoid simple shell commands!
1
Thanks yeah I've been trying to remind myself I picked this company exactly because they was using some of the tech I wanted to learn.
1
Thanks, I hate it.
1
Thanks, I will do some more research on logs and see what I can come up with. I would prefer to avoid ELK since it's a beast. Cheers!
1
Thanks, I'm running on Ubuntu. So I should be looking for resources that show me how to create a service on Ubuntu?
1
Thanks, makes sense!
1
Thanks, though it's difficult to choose a tier, pricing tables are usually intentionally confusing and real bills might differ from expectations
1
Thanks!
1
Thanks! Monitoring is a work in progress. We need way better health checks for sure. Dynatrace is doing a nice job with the distributed tracing though.
1
Thanks! There are definitely some great examples of at-scale use. If anyone is at an early stage of review with TimescaleDB in the mix I'm happy to DM to exchange links etc.
1
That adds up, thank you.
>have a look if Azure Pipelines allows for conditional build steps.. So you can skip the script execution if it isn't there.
That's what I ended up doing; the branches that don't have the script just display an alert saying that they will have to be webpacked manually.
1
That all depends on the role you're applying for, and whether the company has a requirement to understand kubernetes.
1
That doesn't preclude using the cloud, just use more of a pull based model
1
That is good to know since I rely heavily on parallelism feature of gitlab.
1
That is not my call to make, tbh. I can just advise them. But I am not even sure if this is the cheapest solution. We would pay around 2k per year, and rising. I think I need one to two weeks to migrate, so it seems worth it, especially in the long run. I think we would not even consider switching. At bitbucket, we would pay ca. 30% of what we pay at GitLab.
1
That is why there are constructs with sane defaults and automatic least permissive network settings and so forth.
1
That it doesn’t write itself
1
That just isn't how massive scale infrastructure works. Especially as a platform provider. Upgrades to newer versions have to go through a lot of testing and certification. The tooling component isn't a significant issue, the support and stability are an issue. There can also be specific changes even within a minor version that might complicate the upgrade tooling. How you run it at home or even in a small company in no way compares to what it takes to enable new versions for thousands of customers running tens of thousands or millions of nodes. The support risk generally outweighs all other risks unless the attack vector is extreme and has literally no other mitigation options. The nature of the problem changes with scale, and, user base / customer footprint.
1
That looks fun. I’ve spent a lot of time thinking about this sort of stuff as well (scale from zero). So many chicken and egg scenarios lol, especially starting with bare metal (I have worked on this doing platform architecture work).
1
That makes sense, thanks. So I suppose this is the real root of the difficulty:
>...because those branches don't have the script yet
And I should have just done this in such a way that I could push the change to all branches. Seems obvious when you put it like that.
1
That makes sense! Thanks for the detailed explanation :)
1
That people still document "diagrams" instead of considering monitoring and metrics the documentation.
Written docs should be API references, executable examples (like Python doctests, Go examples, …), and requirements specifications
1
That seems like a functionality that would be built into any CICD pipeline, unless they reinvented that too.
1
That sounds like a possible approach. I'll try it out, thanks.
> I can only possibly imagine 3 different html templates, them being dev, (staging?) and prod.
I have Dev, Test, x number of Feature branches, and x number of Release branches (different customers may be on different releases). But it sounds like committing this index.html file may be in error in the first place.
1
That sounds like ansible running some cmd with a misplaced | .
1
That sounds pretty cool. Do you have an example of doing something like this that you could share? With this approach, I take it that the configs live in some sort of a monorepo? Does dagger.io fit in anywhere or is it more for CI? Trying to figure out where to draw the lines using different tools.
1
That would be way too much overhead
1
That, and the self-initiative to use it.
1
That's a good reason, but definitely not the only one.
Having the state in a supported tf backend allows a multitude of operations (mostly CI/CD related) otherwise not possible.
Locked state is a good example.
1
That's actually a pretty encouraging thought
1
That's awesome
We're re-evaluating how we access a lot of our systems at my new org, and I'm trying to push for using it. They've made a lot of positive changes since this post
1
That's interesting, didn't know about that in IPv6
Do you have any references explaining that?
Thanks!
1
That's not true. From scratch I can:
1. create an [EKS cluster](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/eks_cluster)
2. configure the [aws_eks_cluster](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/eks_cluster) and [aws_eks_cluster_auth](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/eks_cluster_auth) data-sources with attributes from the EKS-cluster resource
3. initialize the kubernetes-provider with attributes of these data-sources
4. create resources of type kubernetes-*
You don't have to believe me, but this is literally what my team does several times a day.
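A minimal sketch of steps 2 and 3 above, assuming the cluster resource is named `this` (provider version pinning and the rest of the config omitted):

```hcl
# Step 2: read connection details back from the EKS cluster created in step 1
data "aws_eks_cluster" "this" {
  name = aws_eks_cluster.this.name
}

data "aws_eks_cluster_auth" "this" {
  name = aws_eks_cluster.this.name
}

# Step 3: point the kubernetes provider at the new cluster
provider "kubernetes" {
  host                   = data.aws_eks_cluster.this.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.this.certificate_authority[0].data)
  token                  = data.aws_eks_cluster_auth.this.token
}
```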
1
That's only if he puts in place autoscale, no?
1
That's really cool indeed. I'd like to somehow try and use Terraform but if I'm sticking with bare metal I don't really see how it could be used. Although I am using VMs now and will use VMs to test and learn, and I could learn Terraform just for fun but I don't think in future it will be on my use list. I will have to see!
1
That's true!
1
That's what I also think - maintaining stuff and keeping the seniors back free. I'm afraid if I take over this project I'll be treated as the "expert" of the topic and then have to do all this with no promotion since "I've done it in the other position already". My employer is not promotion happy and the company's general attitude is to do more work with as few people as possible.
1
That's why I always say: the default configuration should be the more secure one, but it's usually the less secure one. Most devs create easy defaults to onboard users quicker...
1
That’s a good start!
1
That’s an interesting point. I haven’t even thought of that. Even so, it’s not just recruiters. Like, I have had people in tech from local companies who aren’t recruiters come out of the woodwork to connect with me. It just seems like when you mention “Kubernetes” it’s the new “it” tech people believe will transform their organization. I think it helps but it’s also not that alien. Seems odd.
1
That it’s not versioned and/or codified.
1
That’s not true.
1
That's true, but the thing is that I don't like development at all. Especially when you have to deal with projects that are already quite far along, where you have to not only write your own code but also understand other people's code.
This can be so hectic, and honestly I don't want to find myself on this career path... That's why I wonder if I can survive as an operations guy, you know, the old-school admin.
1
The “cattle not pets” thing is a mindset very important for containerization. I recently just did a Kubernetes upgrade in all 6 of our environments and ended up needing to blow away the whole cluster each time due to one of the terraform modules we use causing trouble. It’s amazing how easy it was to destroy the entire cluster and all of its services, 6 different times, with literally 0 issues whatsoever in each environment. Came right back up without a second thought.
1
The Atlassian software suite seems to be the de facto standard for managing Agile projects, especially Jira. My last 4 orgs used them, startups and enterprises both. Based on your requirements, you basically only need a Scrum board with different stages for tasks, and you could even do this in Excel if you wanted to. Where Jira differs is in the third-party integrations it provides with GitHub, etc.
If Jira is too pricey I'd say you can make do with [open-source or free Scrum boards](https://www.atlassian.com/software/jira/comparison) as well. Trello, Asana, Pivotal, Notion, LucidChart all work very well. But like others have pointed out, please remember that Jira won't automatically make you Agile, and in my experience it often has the opposite effect. I'm not a Scrum manager but others in this post have some great advice on how to proceed.
Good luck. And welcome to the dark side :)
1
The best projects I've worked on had good SQL mocking for unit tests. Once you've hit that point you can generally easily inject that data into your dev environments for functional testing and you get to manage the environment cleanly with your code repo as developers make changes.
In an ideal world I'd like to database version control using a tool like Liquibase to empower the developers to manage database evolution, but the projects I've been working with aren't quite there yet.
1
The best way to learn how to build a pipeline from scratch is to literally just build one from scratch. Pick a platform (github/gitlab), pick a cloud provider (something cheap like linode or digitalocean), and write a pipeline to deploy a simple web application
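As a starting point, a hypothetical minimal GitHub Actions workflow (the repo layout, commands, and deploy step are all placeholders to adapt):

```yaml
# .github/workflows/deploy.yml
name: deploy
on:
  push:
    branches: [main]
jobs:
  build-and-deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Build and test the app
      - run: npm ci && npm test
      # Replace with whatever your provider needs, e.g. rsync/ssh to a droplet
      - run: echo "deploy to your linode/droplet here"
```

Even a toy pipeline like this teaches you triggers, jobs, runners, and secrets management.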
1
The biggest gotcha you'll hit is that there are a million services and even little parts of services that aren't supported. For just kubernetes you should be fine, but don't expect to just copy over your existing terraform or anything as you'll likely get caught out by stuff like KMS
1
The biggest reason to separate out your frontend is independent growth and deployments. You could add in a second backend later or deploy a new feature to the frontend without deploying an updated backend. But for simple monoliths I like to combine them. Some people will say you should put your frontend in an s3 bucket and host it from there, but you can easily cache your static assets in a cdn with your current setup, so your express server is rarely ever the one doing the hosting.
As for moving to azure, this sounds like a possible option, but not a requirement. If the app becomes too much for your current setup you can always move it later and just repoint your dns to the new location and it should work out. That said, if you run into scaling issues Heroku likely won't be the bottleneck, it is a solid production platform. I think the biggest reason to move to azure is for the experience if you need it, or to co-locate with other assets you might have in the area.
The containerization method used by Heroku is similar to but more optimized than docker, so the only reason you would dockerize your vue setup is because docker is already in your infrastructure layer, such as running in kubernetes.
All in all, your current deployment sounds fine, but as a business person you should already start planning the move, know why you would need to, and be prepared to do so if the need ever arises, though it likely won't.
1
The Clash
1
The comment from u/sillygitau was mostly on point, but since you mentioned Linux VMs and being afraid to get lost in this ocean of tooling, I would suggest against trying to use too much tooling for that.
Just to reiterate, making sure you got the basics right: containers are 'short-lived' and should not be tampered with / modified at all. Once a container is built and deployed, the only thing you should be doing is troubleshooting and securing information which devs might need for a bugfix.
**You do not modify a running container. Containers are not VMs.**
Dev/Container lifecycle:
1. Changes are made to app source code
2. App is compiled or bundled together (seeing you mention PHP)
3. Container image based on the Dockerfile is built
4. Image is pushed to a registry
5. ----- deploy: -----
6. Container instance started
7. Container happily running
8. Container stopped / terminated
9. Delete / cleanup
So, with this clarified, what you could do next is:
1. Get devs to try out the dev-build-push workflow locally
2. You have to figure out how to deploy manually
3. Start automating the whole process step by step
For the deployments the simplest thing would be to have a deployment script which pulls a container from a registry, stops running containers and starts a new one. In order to automate this, you could then run the script remotely over ssh. So far you won't need any special tooling and the process will already be better than what you have now. Further improvements would require you to start using platforms / tooling that will help you with this, but don't focus on this too much in the beginning.
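A sketch of what that deploy script could look like (the image name, container name, and port are placeholder assumptions, not anything from your setup):

```shell
#!/bin/sh
# Hypothetical minimal deploy: pull the new image, replace the running container.
deploy() {
  image="$1"                              # e.g. registry.example.com/myapp:latest
  docker pull "$image"                    # fetch the freshly pushed image
  docker stop myapp 2>/dev/null || true   # stop the old container if it exists
  docker rm myapp 2>/dev/null || true     # free up the container name
  docker run -d --name myapp -p 80:8080 "$image"
}
# To automate: ssh deploy@yourserver 'sh /opt/deploy.sh registry.example.com/myapp:latest'
```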
Also - ask devs very very nicely to use git.
1
The company I have just moved from didn't use Kubernetes and used technology I am familiar with on the technical side from when I used to work at a tech startup. However, they didn't have any real DevOps or SRE practices and had only just restructured their ops teams. So I ended up spending a lot of time on planning stuff at an epic level, such as getting from manual deployments to automated deployments and better logging, and improving the infrastructure architecture to promote higher availability.
That meant though for the past 2 years I've not spent as much time doing the doing or the actual coding.
1
The cool kids are doing this in docker
1
The country store
1
The devops handbook and optionally the phoenix project.
1
The fact that you’re this concerned is more impactful than any disciplinary action that would be on the table.
If this isn’t the norm, I wouldn’t sweat it too much. Be honest about what happened, own the responsibility, and offer to make up the time.
1
The first line of my summary is “I’m going to ignore your DM If it doesn’t offer remote” and I still get borderline offensive “hey I noticed you don’t *truly* want a JOB” follow-up messages
1
The first time I made a route table change on a subnet in AWS I replaced the default route (the one pointing to 0.0.0.0/0, basically all external, public traffic) with the route I was intending to add, on our primary subnet in our production account. I was on a call while doing this, and didn't notice the alerts streaming through our alerts channel. Basically, we were still receiving requests from our customers, but our servers couldn't respond back because the response packets had nowhere to go. 30 minutes into the outage, the VP of engineering came into the conference room and asked for all hands on deck for the outage. I suspected my change almost immediately after seeing when the alerts started and it lining up with my change. I went and checked the route table immediately and sure enough the default route was missing. Luckily it was an extremely easy and quick fix and it was back up in less than a minute of me being notified. I didn't get yelled at or anything, learned from my mistake and wrote up a thorough post-mortem, not offering excuses but pointing out the fact that AWS puts the default route on the very bottom of the route table when you go into edit mode (back then at least), and how a route that points to 0.0.0.0/0 looks like a placeholder for a new route to be added, especially if you're not familiar with the route edit behavior in AWS.
One other one, in the beginning of my IT days, I was working on the CEO's Macbook troubleshooting some issue that I can't remember. This involved deleting a directory in one of the hidden Library directories so I decided to do it from the command line rather than figuring out how to view hidden files and folders (I was still new to Macs at the time). I started typing the command, using the full path rather than the relative path "rm -rf /Users/ceo <enter>". I accidentally hit enter instead of tab somehow after pointing the rm command at his user directory. I immediately held the power button down but it was too late, his system was corrupted. I had to spend the entire next day manually grabbing what I could off of his drive and copying it to a new laptop. Dude was absolutely pissed but didn't fire me for some reason.
Last one was on my 2nd day at my first job in IT. We used freenas on our fileserver where we stored everyone's home directory contents, I had zero experience with freenas and fileservers in general. While I was creating a new home folder for a new employee, I literally transferred ownership of EVERY employee's home directory to some random dude in finance. One of the senior guys spent half the day fixing the ownership issue manually.
1
The folks I work for use Gitlab right now.
My last job was a Jenkins shop and it was terrible. I'd previously used it for small envs but this was a major enterprise build env and it SUCKED. Maintenance and performance nightmare.
Before I left them I was looking into Zuul, which looked very nice and much more manageable than Jenkins.
1
The immediate issue here is going from 1.22.2 -> 1.22.11 is not possible on their platform.
Eventually when 1.22 is no longer receiving patches, not being able to migrate to the latest 1.23 version is a problem.
They house a pretty fat attack vector relatively speaking, if some cluster breaking issue were found with v1.22.2 right now.
1
The issue here is that many developers think they can do it all. And in the end they can. But they often miss all the other factors on the way like security, conformance and the big picture of how everything on the infrastructure layer is intended to fit together. Yeah it’s easy to boot up something here and there. But that just leads to chaos when the company grows …
Having been on both sides I noticed how my priorities and my focus changed to different things. No way I did everything perfectly.
1
The lack of it.
1
The low hanging fruit are usually things like downloading every jira ticket to do some Natural language processing or clustering to determine what the biggest bugs are/which servers etc.
If you have a data science team, you can become friends with them and work on projects together. I’ve always worked at companies that do AI so it’s easy to find something I can influence in devops to help them.
It’s not like your manager is going to let you set up a model but a lot of the time, I’m using math to quantify why someone’s idea or architecture doesn’t scale the way they think it will. A lot of combinatorics and big o lol.
1
The main advantage of Docker for work environments is that containers don't pollute your main system with tools you may only use occasionally.
The second benefit is that they're portable and shareable. That means you can take your environment to any computer, and share it with your friends on Reddit!
Docker workflows are dope
1
The main difference I have found is that any provisioning automation you might want to re-use will require updates, as AWS China has different ARNs. Instead of `arn:aws:...` it is `arn:aws-cn:...`
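If your automation builds ARNs itself, making the partition a parameter keeps it reusable across both. A hypothetical shell helper, just to illustrate:

```shell
# Build a partition-aware ARN; pass "aws" for commercial regions, "aws-cn" for China.
arn_for() {
  partition="$1"; service="$2"; region="$3"; account="$4"; resource="$5"
  printf 'arn:%s:%s:%s:%s:%s' "$partition" "$service" "$region" "$account" "$resource"
}

arn_for aws-cn s3 "" "" my-bucket   # → arn:aws-cn:s3:::my-bucket
```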
Another difference is that the two available regions (Beijing, Ningxia) are run by different companies, and different AWS services may or may not be available in a specific region.
Also be aware that services may lag behind when it comes to features. An example is AWS Organizations. In CN you will lack a lot of features like Service Control Policies for example.
The most annoying issue I have run into recently is that AWS ACMs (Amazon SSL certificates) do NOT work with CloudFront in China.
Hope this helps a bit. Feel free to DM me, if you have any specific questions.
1
The manager probably opened a position for these strategy decisions to delegate it.
I don’t feel it’s what a manager should do
1
The most practical aspect of DevOps is to automate all your "personal" repeatable tasks.
Does not matter if you are a Developer, A QA tester, A network admin, A systems Admin, A Security Engineer, A Business Analyst, A project manager or any thing else in the chain.
If you can make your own work easier by automating repetitive tasks, it will give you more time to focus on the quality of your code, the security of your infrastructure, or whatever else makes your product better. Everything else is BS and jargon to blow up people's egos or put up a veneer on the BS in the process.
If the downstream team needs your end product in a certain way, automate it and send it to them. Unless it is a genuine issue that leads to more problems, if you resist and waste time trying to prove why they are wrong, you are the blocker in the DevOps process.
This is the chain/bridge part of your DevOps process.
TLDR:
* Make your personal work life easier by automating repetitive tasks
* Focus on maintaining good communication and coordination with your upstream and downstream team.
* If you can effectively do these 2 things, you are doing DevOps.
* Everything else is tools which come and go and keep changing with a
* change in process,
* change in management, and/or
* change in technology.
1
The name is terrible, the product is good
1
The name stands for " YAML ain't markup language". As mentioned, it is a data serialization language. There is nothing to "mark up". Per their spec:
> YAML™ (rhymes with “camel”) is a human-friendly, cross language, Unicode based data serialization language designed around the common native data structures of agile programming languages. It is broadly useful for programming needs ranging from configuration files to Internet messaging to object persistence to data auditing
OP said "programming language"; he never specified they had to be Turing complete. A programming language is any set of rules that converts text to machine code output. As such, YAML is included.
1
The Netherlands are in the middle of a major housing crisis.. I moved back to my German apartment from there, because as a single person household, I couldn't borrow enough money to afford to buy a reasonable home that wouldn't incur a lot of work or cost for renovating..
Things are moving out from the coastal region between Amsterdam and Rotterdam towards the country center.. Utrecht has become a booming startup city, because it has a big university, it's the central railway hub for the Netherlands and has a nice vibe..
In Germany's bigger cities, it's similarly bad, but renting an apartment is a lot more common, so you might be paying the same sum, but in instalments. But compared to the Netherlands, the infrastructure seems worse sometimes.. A very analog administration, having your own car is preferred over public transport, the city centres are dying, the healthcare system has been pressed for profits for years now and it shows...
Out of those two countries I've lived in, I'd opt for the Netherlands, but other people probably have something to say about the other countries on the list :)
1
The next evolution is management usually…
Speaking of, just found out I’m being interviewed internally for a manager position.
Later nerds, I’m going corporate!
1
The obvious question is where this “one request at a time” requirement comes from. Is that really needed?
As you said, the 3 nginx pods may not share round-robin state. There are a lot of ingress annotations you can work with to customise. One of them allows you to fall back on the Kubernetes service for discovery. I don’t recall the exact name of that annotation, but it might solve your issue by falling back to DNS round robin.
1
The one that got me fired: I installed a backup agent on THE production database that ended up killing all current sessions. 3 days before xmas.
The one that didn't get me fired: I accidentally ran sysunconfig on the NIS master because my mouse was in the wrong window. FFM 4lyf.
The one that should have got me fired but didn't: I stopped one of the prod Oracle databases to do some disk maintenance. Turns out, wrong database. I did increase the disk space on both and it saved us an outage later on, but my boss was not amused.
I've been doing this for almost 30 years at this point, if there is any one thing I've learned, Never work tired.
1
The only reason to multi cloud is to light money on fire. Both in cloud service providers and labor to deal with additional complexity.
1
The only way to learn programming is to program, get stuck and get yourself unstuck through online research. There are several free or cheap online resources for that and many have active communities of learners helping each other out.
1
The OP was talking about a big company and they usually have big processes ;)
Including dedicated people just for bookkeeping and they don't want to have to touch all of the infrastructure, including AWS and anything left inhouse and god knows where..
In a small, AWS-only shop? Sure, just pull the data from AWS real quick.. In an enterprisey environment, they will probably insist on using "their" tools :)
1
The people that write the tutorials for AWS. Probably nobody else.
1
The Phoenix Project provides the story that hooks you, the Handbook provides the instructions, and Accelerate provides the research and methodology
1
The Phoenix Project
1
The point is you can write a terraform module that is pretty unreadable. This is not a privilege of Pulumi only.
1
The point was that Drone allows runners on OSS. Nowhere here does it say anything about Enterprise licensing being needed: https://docs.drone.io/runner/overview.
1
The preference is to have the build script in the version control because then you get version control for your build script which is a huge advantage. You should be able to make a separate branch with just the build script and then merge that branch into the branches missing it.
1
The problem is that the theory behind it will not change as much as the specific technologies used. But yes, they probably should anyhow
1
The problem will keep existing, just the title and pay will change over time. I personally believe that traditional platform/DevOps teams will go away eventually when developers finally figure out that our job isn't that difficult. Plus, it'll be to their benefit to know exactly how their applications run.
1
The proper way would be to have a proper deploy script on your remote that handles starting the bot in a non-blocking way, i.e. write a systemd (or any other init) service that handles restarting/starting/stopping the bot behind gunicorn or whatever. Then have the deploy script simply call the service to restart the bot, and call *that* deploy script from your workflow.
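A hypothetical unit file for that setup (the paths, user, and app module are placeholders, not anything from the original post):

```ini
# /etc/systemd/system/mybot.service
[Unit]
Description=my bot behind gunicorn
After=network.target

[Service]
User=bot
WorkingDirectory=/opt/mybot
ExecStart=/opt/mybot/venv/bin/gunicorn app:app
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

With that in place, the deploy script boils down to something like `git pull && systemctl restart mybot`, which returns immediately instead of blocking the workflow.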
1
The python image is usually much larger
1
The question is whether I have to add the SPF record in Google Workspace or AWS?
1
The right way is the way _you_ learn best.
There’s no single correct answer here
1
The root issue is BGP. It’s not a good protocol; it’s useful, but it’s old and lacks failsafes.
1
The S3+CF is just hosting the static front end assets I think - there won’t be uploads.
1
The saying goes practice makes perfect, but I prefer to think of it as repetition makes things easy. Those other people just have more reps than you.
1
The scenario is I have an app called MyApp.
MyApp-Instance1, MyApp-Instance2, and MyApp-Instance3 all get deployed at the same time. They all have the same database migration script because they're all instances of the same app. But all instances of the app work on just one single instance of the same database.
1
The secrets issue is a little bit orthogonal. Sops is great for storing encrypted secrets in git, because you can commit the encrypted stuff to you git repos and then control access via the permissions on the encryption keys.
https://github.com/mozilla/sops
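For example, a hypothetical `.sops.yaml` creation rule (the age key is a placeholder recipient) so everything under `secrets/` gets encrypted before it hits git:

```yaml
# .sops.yaml at the repo root
creation_rules:
  - path_regex: secrets/.*\.yaml$
    age: age1examplepublickey   # placeholder; only holders of the matching private key can decrypt
```

Then `sops --encrypt --in-place secrets/prod.yaml` produces a file that is safe to commit.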
1
The sheer audacity. The immodesty. The conviction that I must be speaking in facts. The complete lack of self-awareness.
I've no idea why these technologies have failed for you, but considering that they work well for many other people yet not you, I think I see the common thread here.
And you likely never will.
1
The state should be stored outside git because it can contain secrets. Using cloud-based object storage with a locking table is the recommended approach.
1
The thread is about cargo culting, where do you think you are?
I don’t even dislike those technologies. I think they’re all rad. I’m allowed to have a nuanced opinion in which the appropriate level of complexity of problem should be matched in its engineered solution.
I don’t know what your problem is (as in, why you just jumped to being a _dick_). Is it because you thought you had a “gotcha” when in reality you were misinterpreting what the Spotify example was about? Or did you just realize you may have prematurely introduced a technology at too small of a scale? or did I mention a tech of which you’ve been accused of having latched onto foolishly like the cargo culters?
I was completely fact-based in my opinion, I don’t know why you’re taking it personally. So what if you prematurely optimized? We all do several times throughout our careers.
1
The tool you're looking for is the AWS CLI. Switching accounts is actually built into it. You should create a role in account A and grant account B permission to assume this role. Then, during or prior to deployment script execution, you simply switch roles.
https://docs.aws.amazon.com/cli/latest/reference/sts/assume-role.html
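You can also wire this into the CLI config so the switch happens automatically (profile names and the account ID below are made-up examples):

```ini
# ~/.aws/config
[profile account-b]
region = eu-west-1

[profile account-a-deploy]
role_arn = arn:aws:iam::111111111111:role/deploy
source_profile = account-b
```

Now `aws s3 ls --profile account-a-deploy` performs the sts:AssumeRole call for you behind the scenes.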
1
The tools you mentioned are here because they solve different use cases.
Like multiple physical environments vs workspaces in Terragrunt. It's not to cover Terraform limitations but to extend it for different use cases, because Terraform is an opinionated tool.
1
The true continuous delivery (of disaster).
1
The work to set up a proper CI/CD pipeline is a fraction of the work it will take to fix eventual bugs, deployment issues, infosec, patches, deployments, etc. Also, don't even think about scaling...
1
The worst is when people write way too much. Be concise or your readers will _not_ read it; the vast majority of people are skimming and what you're writing about isn't interesting enough to be a special exception.
1
The worst thing about Terraform/HCL is that it's basically un-refactorable without major infrastructure teardown and rebuild.
1
The writing wasn’t for me. It came off more like a “Just So story” rather than something believable. I get frustrated when people read it and think they’re the next business superstar.
1
Their Git repository is great, but their pipelines or CI module sucks. (Again it is contextual, but in my experience, it did not fulfill many of our requirements)
1
Then revoke their access to your board... this is a management/org issue, not one with the tool...
1
Then you get the experience and move somewhere else.
1
There appears to be some missing information. What is happening instead? Does telegraf refuse to start at all, or does it just keep running while mqtt input is disabled? Maybe what you're experiencing is the expected behaviour and some other workaround should be implemented.
1
There are limited opportunities out there for juniors, there's no good systematic approach across the industry for hiring and onboarding them, and so the best signal boost I know of to increase your chances of getting hired into a junior cloud-ish role is to Build Something Real.
1
There are no manager, there is no team, it's just me, automating deployments so they don't all get dumped on the one person that actually validates that the application starts before releasing.
On the other hand, I have free rein to implement any tool I see fit and work 36 hours with overtime as PTO.
On top of that, every month or so, I update the documentation for what I have setup. We now have about 30 ansible roles (reusable blocks) and 80-100 playbooks (deployments).
The way I survive? Automate everything. Deploying something, anything is always the same command :
ansible-run project/deployment environment
ansible-run devops/ara staging
That script starts a bw-agent container, makes you log in through the CLI on first run, connects you to HashiCorp Vault by fetching your password from Bitwarden, fetches your private password through Bitwarden, sets up the env for ARA, and runs Ansible with the correct parameters, unlocking vaults using a client script that fetches from HashiCorp Vault or Bitwarden.
Every component runs in a container, the only requirement is having wsl setup (base image is also built so everything is preinstalled).
1
There are reasons to run multiple processes in one container, I have done this in the past with postgres and repmgr, but it is always a little... janky.
I used supervisord for that case, but I think your link covers the topic pretty well.
1
There are some ways around NATs. Don’t assume they give you protection.
And the security required to do the equivalent with publicly routable space is 1 simple “deny all inbound” rule. So not terribly complex (I’d argue simpler than NAT setup almost).
1
There are things like double-encapped packets and other problems in particular implementations.
I.e. I wrap a v4 packet with a private destination in another v4 header, sent to the public IP of your edge router. The router gets it, strips the outer header, finds another packet inside addressed to a private IP, and routes it to that private IP on the local LAN.
For sure such attacks are rare and implementation-dependent. But they do exist. Actually having a firewall is always a good thing.
1
There are tons of practices and patterns that most of us run into and use to make sure that an outage or failure is either not impactful at all, not catastrophic, or (quickly) recoverable. I am sure that there were patterns they could have implemented, or implemented better. I think as a developer/sysop/architect/designer you need to consider some of the risks and have the manager types handle impact, and together determine mitigation.
The big glaring thing here isn't that it happened, but that it was not fixed until after the company lost billions in valuation, and the future impact will likely add many more billions. I think it is a great example of mismanagement and a failure to invest in ways to mitigate an outage like this. Maybe they thought the risk level was acceptable?
Redundancy may have helped, but it's likely a cost thing. It looks like the updates impacted so much of the network that the redundancy wasn't enough to cover all of what was affected. Sometimes you have to, as a business (not as an engineer), decide whether it is worth having a fully redundant network up and running alongside your operational one. Redundancy is high cost and low return. Of course... I bet now they would have eaten the cost or invested more in people and technology to act and fix more quickly, rather than have their valuation tank and a potential merger likely fall through.
1
There everything sucks.
1
There is still so much to learn and so much to do to be able to share accurate information.
1
There is a difference between Jenkins and Jenkins X. Jenkins X is very close to how GitLab works.
1
There is a serious fundamental lack of understanding in the devops community about what "declarative" actually is.
People think it's using a configuration language, but it's not. If it builds a DAG, it's declarative
1
There is Ara for that; it does the same.
1
There is no chance that you can predict a candidate is good. Only the trial period will show that. The interview is just a way to confirm that the candidate is sane. Nothing more.
1
There is no difference between BYOD (bring your own device) for devs, and devs having local admin on their devices.
BYOD is regulated, and works very well with zero trust architecture. The old walled garden approach is dead in the security world. We don't trust any device now.
1
There is no such thing as junior DevOps. It's a role for experienced devs and sysadmins with cloud knowledge and scripting skills.
1
There isn’t but I am thinking about how this goes next now so maybe we cover AWS there.
1
There might be exceptions to the rule. If ArgoCD is one, I hope that they have valid reasons to do so. I, personally, would not take anything as "good practice" that comes from the Kubernetes universe without questioning the motives for using this practice first.
But reading the docs, ArgoCD seems to be an IDE that creates declarative pipelines. So far, so good, declarative pipelines are source code.
"\[...\] providing facilities to automatically sync the live state back to the desired target state" reads as "if my build artifacts change, I will adapt the source code accordingly".
For whatever reason this would be a good idea, it is still a separation of code and artifact, so well, do it. If you do it via PR's, there should be no issue with commits that come from other sources (like the developers).
If ArgoCD just uses Git as a database for auto-generated configs, I would have this repo for Argo only and keep my actual source code in a separate one.
1
There were a few actually. We did a very thorough retrospective afterwards which led to a number of process changes.
The rollout contained too many changes at the same time. The base need was updating to larger instances (z1d.6xlarge IIRC). But we also switched from named instances to autoscaling, which means that a number of old safeguards were missed that would have saved us due to EBS disk snapshots etc.
We also didn't try the exact instance type in our staging experiments, but instead opted for the smaller z1d.3xlarge. Interestingly enough, there was a difference in the way the hardware clocks worked between these sizes, which triggered a fault in our bootstrap scripts. This fault didn't trigger any monitoring.
Hidden deep in the bootstrap scripts was a line that said to retry the puppet run 50 times, and if that failed `shutdown`. For instances in an autoscaling group, that means terminating the instance and starting another. You can see where this is going...
And lastly, since the testing had gone smoothly, we replaced all three master nodes at the same time. They started up and joined the cluster just fine, and everything rolled on nicely until about 1 hour later, when suddenly *all three instances terminated themselves at the same time*. 😳😱
1
There's [Permit.io](https://Permit.io) that manages policies as code (based on OPA and OPAL), while also giving you a UI that generates Rego for you.
1
There's a simple answer - get a job in a company with those practices.
People don't learn entire approaches on their own - you *have* to be in an environment that uses that approach to acquire it.
There are some companies that are just old as fuck and are never going to update - you work for one of these. That's fine, but when you've outgrown such a company you have to move on.
After you've worked at a place with superior practices, and become *extremely* good at those practices, you might be able to become a one man "change the entire system" type of developer, where if you go into a place like your current one you can single-handedly drag them into the future.
But you can't become that without first being the junior member of a team that does that. Go looking for a new job - you obviously have the requisite experience.
1
There's also Terraform CDK and AWS CDK (obviously AWS-only for that one) that allow you to use languages like Python / C# / etc. It's not a Pulumi-only thing.
1
There's an Amazon brand of compression apparel named 'Devops'. Buy him all the pieces with the biggest branding :)
1
There are dozens of these. I've recently stumbled upon one called Miro which is nice.
Visio is the better known one, but it hasn't kept up with the times.
1
These are great thoughts, never heard of U2F, thank you!
Aside from the technical side, I am very interested in a "framework". Not only tools, but some **pro-actively** working reminder that will ping me in email or Slack (if I forgot to remove something) that there are some "dangling" credentials that should be cut off.
In other words, how to be sure that I won't forget to remove "AWS IAM roles which you need to remove", since people like you and me usually have a lot of other things on a plate?
1
These are in the UK by the way
https://automationlogic.com/service/joining-the-academy/
https://www.qa.com/learners/become-a-digital-consultant/
https://capgemini-engineering.com/uk/en/careers/
1
These certs are plenty for landing interviews. Honestly, if you can get a job as a junior devops and have solid people on your team to learn from, you'll break out of that 'junior' level in 6 months with your company's tech stack if you put in the effort to learn.
1
These do automatic Let's Encrypt renewal as their main perk over nginx.
1
They did not.
1
They do but it’s not the best defense in depth to use their key management.
1
They do! I know of someone who is using it at the scale you mentioned in the OP
1
They do. The practice is called BYOD and is regulated under many security certs including SOC, HIPAA, ISO27001 and in NIST best practices.
A lot of enterprises decide to not implement it due to complexity, but it's very achievable, and works well with zero trust architecture.
In practice, there's no difference in having a dev bring their own device, and giving them local admin on a company device.
Source: I'm my company's security compliance officer, and I navigated a move to zero trust architecture with HIPAA and SOC2 certs.
1
They either have or are currently devising a digital nomad visa. One year extendable to two, no taxes, can open a bank account. I'm sure you can get an attorney to get actual residency after that, then you can get a local job.
1
They have a better explainer video on their homepage: [gettrici.com](https://www.gettrici.com/?utm_source=reddit&utm_medium=post&utm_campaign=devops)
1
They must churn out backlinked blogs all day. No way would personal devices be allowed to run proprietary code where I work.
1
Thing is, DevOps isn't entry level.
You may get in right away with self-teaching if you're lucky,
but your best bet is to start "lower down" the ladder.
Excuse my poor English, hopefully it didn't come off as offensive; English is not my first language.
1
Think about it like this..
Would you lose your web dev skills if you switched to embedded software engineering?
What if you switched to data engineering?
The answer is, anytime you switch to a different subset of software engineering (devops included), you will lose out on what you switched from, because it requires different skill-sets.
1
Think you might be conflating releases with versions: https://kubernetes.io/releases/
1
thinking people will notice my work and promote me
1
This
1
This assumes duplicating data is going to eradicate outages completely. Nothing you do will accomplish that. It’s more like:
Cost of outages without duplication > cost of outages with duplication + cost of duplication.
Where duplication costs include, but are not limited to additional labor, additional computing resources, additional risk of client data exposure and compromise. It’s hard to imagine a lot of environments where the math works out to justify it.
1
This cluster cost about 20k usd per day when I left. The upgrade to ES 7 took a couple of years in the end, lots of dependencies and tools that needed changing, not to mention *a lot* of parallel testing.
1
This could have easily been to access task manager. Always a good idea to disable it.
1
This gets asked on a daily basis. I recently provided a [suitable answer](https://www.reddit.com/r/devops/comments/vu7x7e/-/ifbubsf).
1
This heavily depends on how you want to go, ansible is a very smart SSH multiplexer, so if you want to manage say nginx configuration files and each server requires individual configuration because it's running a separate site then things become awkward.
One possible solution is to have all the different configuration files in the playbook with the same name as the server and then have ansible scp the files to each server using the servername to indicate which file to source from
Another possible solution: you could have a directory in each git repository that includes the configuration for that particular project; the ansible project SSHs into the server, checks out the code, moves the configuration file into place, and does a graceful restart. Extra points if each project defines a fixed set of Makefile goals, so your ansible playbook doesn't care about particulars and only executes `make install`, `make configure`, `make deploy` and `make restart`.
The really important goal, IMHO is to have things in a repo somewhere and from there have an automated way to push those things from the repo to your servers with the least amount of manual work possible
My original message was just suggesting initial steps to start using practical automation and C/IaC. It's easier if you have a small goal in mind because you can focus on that particular goal and have a sense of achievement once you get there, installing monitoring agents for CPU/Disk/Memory is a very good initial goal if you use the same OS on all the VMs
Once you are familiar with the tools you can start branching out and building more complex solutions to automate more things in your environment
That is not to say that I want to cut the conversation short, feel free to reply or DM and I'll be glad to help, with the caveat that I haven't touched ansible in over 3 years so some of my proposals might be outdated by today's standards
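To make the first pattern concrete, here's a rough Python sketch of the "config file named after the server" convention; the directory layout and function names are made up for illustration, and an ansible copy task with a templated `src` would play the same role:

```python
import os

def config_for_host(hostname, config_dir="files"):
    """Resolve the per-host config using the 'one file named after
    each server' convention, failing fast if it's missing."""
    path = os.path.join(config_dir, f"{hostname}.conf")
    if not os.path.isfile(path):
        raise FileNotFoundError(f"no config checked in for {hostname}: {path}")
    return path

def plan_pushes(hostnames, config_dir="files"):
    """Build a (host, local_config) push list -- roughly what an ansible
    copy task does with a src like '{{ inventory_hostname }}.conf'."""
    return [(h, config_for_host(h, config_dir)) for h in hostnames]
```

The nice part of failing fast like this is that a missing config is caught before any server is touched.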
1
This is a good opportunity to spend a few months of self-study to learn Python. There are TONS of good resources out there, and even a lot of "Python for DevOps" books and courses. Just dive in and you'll be up to speed faster than you think. Especially if you're getting opportunities to solve real problems with your new skills.
1
This is actually difficult for most Universities to apply in practice. Most CS / IT / Eng faculty staff are already over burdened between research, supervision, teaching, and administration without finding time to apply Dev(Sec)Ops methodologies to their practical courses. Educating students about the fundamentals that will enable them to pursue interesting careers can already be a challenge. Also, a lot of faculty staff also tend to be more focussed on fundamental or deep research rather than up to date on the latest industry practices.
That said, I do think that University courses should all invest more effort in providing better access to Dev(Sec)Ops tools and a course stream about how to be productive using them. Not even just for the benefit of industry, but to have more productive students with better skills.
1
This is all why good teams are, well, good. On good teams, every person brings their special something to the table. Good management will use those strengths, which unfortunately is also required for a good team. There are a lot of good managers out there, and a lot of bad ones too; I think most of us know that. OP, I'd say do your homework: maybe get pluralsight.com and watch videos based on what you are assigned, ideally beforehand. YouTube too, but be careful in there. There are others. Don't muck it up. Good luck.
1
This is awesome man. It gives me something to aim for.
1
This is awesome, thanks for sharing
1
This is correct in that you absolutely have to retag the images. There is no way around that that I ever heard of.
As to a tool to help automate this take a look at Google's `crane`:
https://github.com/google/go-containerregistry/blob/main/cmd/crane/doc/crane.md
I found it to be a lot faster, and easier to use, than `skopeo` or plain old `docker`/`podman`
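For bulk retagging, here's a hedged sketch of driving `crane copy` from Python; the registry names and image list are placeholders, and `crane copy` itself copies images directly between registries without pulling through a local daemon:

```python
import subprocess

def crane_copy_commands(images, src_registry, dst_registry):
    """Build `crane copy` invocations that re-home each image:tag
    from one registry to another; crane moves manifests and layers
    registry-to-registry, so nothing lands on the local machine."""
    return [
        ["crane", "copy", f"{src_registry}/{image}", f"{dst_registry}/{image}"]
        for image in images
    ]

def run_copies(images, src_registry, dst_registry):
    """Execute the copies, stopping on the first failure."""
    for cmd in crane_copy_commands(images, src_registry, dst_registry):
        subprocess.run(cmd, check=True)
```

Building the command list separately makes it easy to print the plan for review before actually running anything.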
1
this is gold
1
This is interesting.
Short and straight to the point.
Best part is it challenges how I thought it worked.
I ofc have now cleared my Sunday for some hands on funday.
Kudos /u/ARRgentum
1
This is just from my own personal experience, but I found that when I update my LinkedIn profile with just about anything, there is a surge of recruiters reaching out to me for a day or two before the frenzy starts to fizzle out. I feel as if updating your profile shows that your profile is “active” and puts it in front of the unique page that recruiters are presented with.
1
This is not working. I am still able to access the internet from pods.
1
This is old news. Back in the 90s, interviews were based on what Microsoft did. In the 80s, they were based on what IBM did.
1
This is sooo cool. Been trying to install it but getting an error when I try to fetch an API key: `Error: Error sending API request: Post "https://dashboard.api.infracost.io/apiKeys?source=cli-register": dial tcp: lookup dashboard.api.infracost.io on 192.168.0.251:53: read udp 192.168.0.23:62641->192.168.0.251:53: i/o timeout`
1
This is the only reason I create packages these days. An example is generated client and server code for protobuf/gRPC stuff. If it's an application I don't bother with packaging anymore and just deliver an OCI image as an artifact. Python packaging is confusing and broken sadly :(
1
This is the way. ES became a lot less scary (still scary to run, I hate running ES) when I started to treat it as a downstream searchable copy of data.
1
This is true when it comes to some soft skills, but is very far from the truth when it comes to analytical skills.
A simple programming test would reveal a lot about a candidate. How they use their everyday tools, how they communicate, frustration, loss of interest when stuck, and bad coding habits are among the things nobody can hide during an interview.
1
This is very helpful! Thanks
1
This is what I'd ideally prefer. However, one of the reasons I'm using Tornado is because the service requires websockets, and I couldn't get it to work with Flask. Is it possible to use Flask as a reverse proxy and connect directly to Tornado's websockets?
1
This isn't a suitable answer though, the question is targeted at DevOps engineers, what languages do THEY use and why they prefer them. Not what languages are used in DevOps in general.
1
This looks awesome, I might get one for myself!
1
This looks interesting. I’m wondering where this would make sense vs putting your lambdas in a VPC vs just using api gateway.
1
This or Mimir.
1
This post made my Benis hurt.
1
This question has been asked and answered here multiple times. The answer is no, and there is a reason no one hires devops "freshers". Devops shouldn't be your first job. I would say it requires at least 3-5 years of experience in a real software dev environment, or something like server and webhosting provider operations engineer experience, to even consider a transition to devops. As you describe yourself as a non-IT person, you will need to start with entry-level IT positions like support and learn your way up by studying; get to junior engineer or junior dev, keep learning a lot, and after a few years you might be ready to apply for devops engineer. A realistic timeline is 5-7 years from now if you work hard.
1
This seems logical to me. If you don't create records alongside resources, it's likely that resources dependent downstream would fail without a secondary deployment to update a central location for DNS.
1
This sounds an awful lot like you're prematurely optimizing your setup? Is your app growing at a rate where you're on a curve and you'll outgrow the capabilities Heroku can provide by X date? Is some part of the apps performance creating a bad user experience? If not, I'd focus on your app and its features until you encounter a performance issue you can't mitigate on the platform you're already successfully deployed on. Working on the infrastructure will not help your app be successful by itself.
This is just general advice - if you're just tinkering around to learn stuff, disregard and have fun. But if you're trying to hustle and make The Next Big Thing, my advice is to focus on The Thing until it's a compelling experience and you really need to grow the infrastructure.
1
This sounds like reinventing the wheel.
1
This sucks man. I really did not want to say it this way, but here it goes.
A lot of companies that were extremely successful as a startup failed in the growth phase, because they did not plan for (work towards) the end state.
This is a statement applicable to all business processes, not just technical ones. You singled out technology, so I tried to explain the best way I can.
Now if that does not convince you and you don't want to accept that Technology is an important driver in today's startup world, unlike the one in your 15 year old case, then I cannot help you.
The sooner everyone understands that the startup world 15 years ago was very different to today's startup environment, the easier it will be to understand the context.
No offense or disrespect meant for your knowledge or understanding. And sorry for the blunt response.
You need to stop slicing and dicing statements to suit your narrative.
1
This times one million. Small tech heavy companies, especially startups, all have FAANG envy. Like, all of them. To varying degrees of course. I feel like I am constantly reminding people of this.
We don't fucking have Google problems so let's stop trying to build/replicate Google solutions. For example, I hate monorepos and monoliths but I always understand why new companies start out with them. And some platforms are simple/small enough that they never need to even consider deploying microservices.
But whenever someone chimes in with "well Google has a monorepo", I'm always like "yeah, and they have more than two decades of tooling to make it work well for them".
1
This was kind of the answer I had in my head, and got from some closer friends in the field. I'm definitely not willing to take anything outside of my range, and the range is really only there based on how much of their current services stack I'm familiar with; I'm willing to take a little less money for the first 6 months to a year while I get incredibly familiar. My thought was that once I was progressing services, and not just maintaining them, that would be a fair milestone for me in that particular company's position. I can apply and interview for months; the lowered hours at my current job are a voluntary decision made by me, and the owner is working with me very well while I hunt for a different primary job.
1
This was my take, too.
Point 1 could mean just about anything.
Point 2 sounds sketchy but could be nothing.
Point 3 is a big blinking "**You Are Headcount**" sign. If you're not comfortable with the compensation they're offering, feel free to tell them "no thanks" and move on to another opportunity. Especially if you're already working and aren't desperately seeking a position.
1
This.
It's all fun and games until you have to import/migrate a bunch of shit that was held together with bash/cmds
1
This. I use it for so much and none of it is streaming.
1
This. It was full of not only the cringiest and least believable IT tropes: "oh you have 90 days to save your department before we outsource them", "IT saves the whole company", etc. It's like a children's book about some imaginary world of infotech heroes, while the reality is: https://www.reddit.com/r/sysadmin/comments/vubqe5/today_my_company_announced_that_im_leaving/
I won't even go into how americanized it was.. mysterious zen master character Erik with his enlightenment superpowers.. and did you know Bill was *in the marines*? Why that gives him the power to solve *anything*..
1
This. It would be incredibly negligent and a breach of trust to allow someone without the proper credentials or justification to view company data, infrastructure, etc. If they got caught, best case scenario they get fired. Worst case could be potential jail time.
1
This. Just read Accelerate.
1
This.. u/OP you need to look into his comments. Something is missing in your setup.
1
This... I attribute most of my network and web troubleshooting knowledge to keeping mail flowing for users of small businesses... Do it for a year, and you'll quickly realize why the general advice for email is to pay someone else to do it (Gmail, Outlook, Proton).
1
Those are not questions but live observations made while they are doing the technical tests.
1
Thought it’d be so your devs can have the same version of all your tools as is in production so everything is consistent.
1
Three key ones would be OPA, Checkov or KICS (OPA under the hood). There are others, but depends on what you need i.e language support, custom policies, out of the box framework policies for CIS.
1
Three reasons:
Go is strongly statically checked. Python could run for a few hours and then throw an exception on something the Go compiler would have detected at compile time. Speed isn't the problem; the lack of static checking is. At least if you care about reliability.
Single download exe to machines without having to install a full environment.
Implicitly weed out devs who only know python.
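A toy illustration of the first point (purely hypothetical code): the typo below would be a compile-time error in Go, but Python only discovers it when the rare branch is finally taken:

```python
def summarize(records):
    """Runs fine for hours on the common path; the rare branch calls a
    typo'd method name (.uppre()) that only explodes at runtime --
    a statically checked compiler would reject it at build time."""
    total = 0
    for r in records:
        if r.get("flagged"):           # rare branch
            print(r["note"].uppre())   # AttributeError, but only when hit
        total += r["value"]
    return total
```

Until a flagged record shows up, this function happily returns correct sums, which is exactly the reliability trap being described.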
1
Three weeks into kubernetes feels like being 8 years old in high school and you don’t speak the language.
Just keep learning and it’ll all click.
1
Three weeks is nothing.
1
Time series databases are optimized for storing and querying series of `(timestamp, value)` samples ordered by `timestamp`. The `value` can be either numeric or an arbitrary structure (for example, a string, a tuple of named strings (aka pre-defined columns), an arbitrary set of `key=value` pairs (aka event or log entry), an arbitrary json, etc.). Some time series databases are optimized only for numeric values (for example, VictoriaMetrics, Mimir, M3DB). Others allow storing arbitrary value types (for example, ClickHouse).
Each series usually has a name (for example, `temperature`) and an optional set of `key=value` tags. For example, `temperature{city="London",country="UK"}` can identify a time series with the name `temperature` and two tags: `city="London"` and `country="UK"`. A time series is uniquely identified by its name and tags. If even a single tag value changes, then a new time series is created. For example, `temperature{city="London",country="UK"}` and `temperature{city="London",country="US"}` refer to two distinct time series because they have different values for the tag `country`.
Time series workloads usually have the following properties:
- **Huge** numbers of `(timestamp, value)` samples need to be stored - typically more than a trillion samples (i.e. >1000 billion).
- Samples are constantly ingested into the database at a high rate - a typical ingestion rate is more than 100,000 samples per second.
- Real-time queries usually select time series by regexp filters on their names and tags, then process the samples for the selected series on the selected time range. The number of samples that needs to be processed is typically in the range of 1,000,000 to 1,000,000,000. The expected execution time for such queries is less than a second. Example query: "calculate the average temperature over the last month for all the cities in the UK".
While it is possible to use a relational database for storing and querying time series data, it may be impractical given the time series workload properties mentioned above. E.g. a traditional RDBMS will require a **ton** of hardware resources (CPU cores, memory, disk space) for such workloads.
Time series databases are heavily optimized for time series workloads, so they usually require multiple orders of magnitude less compute resources than a traditional RDBMS for the same workload.
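A toy Python sketch of the data model described above (not any particular TSDB's API): series identity is the name plus its tags, so changing a single tag value creates a distinct series:

```python
from collections import defaultdict

class TinyTSDB:
    """Toy illustration of the time series data model: a series is
    uniquely identified by (name, tags) and stores ordered
    (timestamp, value) samples. Real TSDBs add compression,
    inverted indexes for tag lookups, retention, etc."""

    def __init__(self):
        self.series = defaultdict(list)  # (name, tags) -> [(ts, value)]

    def ingest(self, name, tags, ts, value):
        key = (name, frozenset(tags.items()))
        self.series[key].append((ts, value))

    def avg(self, name, tag_filter, start, end):
        """Average over all series matching the name and tag subset
        within [start, end] -- e.g. avg temperature for country=UK."""
        vals = [v
                for (n, tags), samples in self.series.items()
                if n == name and tag_filter.items() <= dict(tags).items()
                for ts, v in samples if start <= ts <= end]
        return sum(vals) / len(vals) if vals else None
```

With this model, `temperature{city="London",country="UK"}` and `temperature{city="NYC",country="US"}` land in separate series, while a query filtering on `country="UK"` aggregates across every UK city.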
1
Times have changed. Devs work from home. They already have their own machines with better desks, keyboards, monitors, and headphones. Everyone wants the freedom to use what they want.
That's why we provide VM remote dev environments, EC2 dev environments, and docker containers for common workflow.
You don't have to use them. It's not forced on anyone. But to sit there and dismiss a workflow that people are telling you is becoming common as absolutely useless to everyone... I just don't understand how you think that way.
1
TimescaleDB is open source, so it is expected that end users will administer it on their own without commercial support. You can always log issues on the GitHub repo and wait for their replies.
We have used TimescaleDB a lot with bare-metal Kubernetes in our projects. If you have someone with a database or data engineering background, they can help you integrate it. Otherwise, a tool like InfluxDB Enterprise may suit you better.
1
TimescaleDB
1
TimescaleDB on postgres
1
TL;DR: Vagrant -> Ansible -> Docker -> docker-compose -> {Terraform, k8s}
1. Start with Vagrant to spin up VM locally or on any remote server. Use default libvirt provider, then switch to KVM for best performance.
2. Deploy with Ansible something that you have previously deployed manually (I'd recommend nginx). That way you'll be able to get into familiar environment after setup is done and discover how to change the variables to achieve various effects (changed config file or restarted systemd unit).
3. Use docker to build the same service. You can easily switch from previous step by using docker Ansible connection.
4. Find some available docker-compose.yml and spin up the stack consisting of 2 or more applications, like LAMP stack. Use volumes to share the data between containers.
5. Learn a bit of Terraform. Start with the same local VM (you'll find Terraform libvirt provider). Integrate Ansible into Terraform stack to automatically provision the same service.
6. Finally switch to k8s. You'll find a lot of advice on how to approach it here in this sub. I'd recommend the Udemy CKAD course from Mumshad Mannambeth.
1
To be honest I could invest my time into Kubernetes and everything I'd need for it and it might be good for the "good to know". But I could invest the same time into learning other tools and learning them better, and I would be much better off with more knowledge I'd say and a much more focused and customised workflow.
So I'm kind of ignoring the existence of Kubernetes now, which makes the journey and start point much more digestible. Starting with making a small lab so I can learn Ansible by building Playbooks to create VM OS templates, provision VMs to spec, and then do initial setup on them so they are ready. That will save me so much time alone, cutting out that manual process. Then I'll be able to easily create those VMs for testing. I was considering Vagrant or Terraform for Proxmox, but I'd rather learn how to use Ansible for that so I can keep within 1 tool.
That's quite funny about VMware, I would have thought they'd created it to solve their own internal problems or make them better and solve others in the process.
1
To be honest, if you plan ahead and don't let things get out of hand, terraform wins every time. If you are getting into places where you need to program complex things simply to deploy stuff, you need to step back and have a think. Usually splitting up into smaller projects helps.
1
To be honest, I can't remember, it's been quite a while since I've used it. I think in any case cicd is highly opinionated, and there is no right or wrong.
1
To err is human. To propagate that error to every machine in Prod in an automated fashion is DevOps!
1
To my disappointment, only the first third.
1
To succeed in devops you’ll need a good understanding of both sides of IT. Hardware, Virtualization, software, api’s, cloud based saas I mean we could go on forever lol
1
Too much manual work. Trying to figure out a way to automate this, possibly with GitLab CICD.
1
Took 5 months to request the right certs for our application servers. To be fair, I was an intern who didn't know about certs or kubernetes.
1
Took down all European websites of a fortune 500 company, due to a fucked up url rewrite file.
This was about 15 years ago.
1
Took down prod by stopping the wrong VMs. We had just migrated some stuff and I noticed right before leaving the office on Friday that the engineer hadn’t shut down the old VMs. I figured I’d do him a solid since it was late in the day and shut them off for him. Yeah. Wrong VMs. 📉. It ended up not being a big deal, but it was embarrassing to say the least.
1
Totally agree. Our stack is reliant on EKS as well as S3, Redshift, RDS Aurora, SQS/SNS, etc. Making all those things vendor agnostic would be an enormous amount of engineering/devops effort to implement and maintain for very little benefit. If Amazon were to go down in every region at once and take our whole infrastructure down globally, the internet as a whole would be totally broken and we've got bigger problems than our app being down for a few hours.
I went down this road before, we were told by management that one of our customers "required us to give the ability to run the stack themselves on-prem". Other customers required us to host a SaaS solution as well, so we had a custom built stack on DC/OS with kafka, postgres, etc. The whole thing was an unstable disaster because everything had to be built from the ground up, in addition to it being stupid expensive to run. After about a year of development the customer came on-site and asked us why we weren't just running using cloud services, then said "We never said anything about requiring this to be on-prem"
The whole company was horribly mismanaged and folded about a year later.
1
Treat the boot camp as the door being opened to some of the things you need to learn, and not as the step before getting a job as a devops engineer.
As others have mentioned build a home lab and learn about everything you can get your hands on. Break stuff and then fix it. As much as possible flex your Google fu and don’t depend on strangers giving you the answers on stack overflow and Reddit.
Decide on an application you want to code and over engineer it so that it touches all the platforms and tools you want to use and then do all the work to make it happen. Then when you’re comfortable with all that rearchitect it to be efficient.
You could apply for jobs during this time but be upfront about what you can and can’t do. Unless you’re willing to bet rent that you can learn everything on the fly, determine what the real expected job responsibilities are, and if you can do them, during the interview.
1
True, they have really nice keyboards nowadays, and the good ones feel so good to type on (some sensory satisfaction). You could also get him customizable keycaps for the keyboard, e.g. https://www.etsy.com/listing/857184051/sandblasted-acrylic-keycap-box-artisan?gbraid=0AAAAADtcfRJSqo42tsKwLVa870igehV88&gpla=1&gao=1&&utm_source=google&utm_medium=cpc&utm_campaign=shopping_us_a-electronics_and_accessories-computers_and_peripherals-keyboards_and_mice-keyboards&utm_custom1=_k_CjwKCAjwiJqWBhBdEiwAtESPaFtFmiulBFYhxpsQqsSNLOwVEOL14amIErLRoSm_wf-_sf25TYbSRhoCr6gQAvD_BwE_k_&utm_content=go_12573079807_124822095292_507896635865_pla-315470422934_m__857184051_253527214&utm_custom2=12573079807&gbraid=0AAAAADtcfRJSqo42tsKwLVa870igehV88&gclid=CjwKCAjwiJqWBhBdEiwAtESPaFtFmiulBFYhxpsQqsSNLOwVEOL14amIErLRoSm_wf-_sf25TYbSRhoCr6gQAvD_BwE
1
True. The money will always be around at least for the foreseeable future.
Try the in-memory db at https://questdb.io - it's probably one of the fastest on the market.
The storage solution would have to be implemented by you.
1
Try InfluxDB
1
Try Regula.
It uses Rego as the policy language,
but makes it easier to code with the help of predefined modules.
1
Try setting the CPU resource limit at a full core.
Otherwise the process gets throttled and health checks fail.
https://engineering.indeedblog.com/blog/2019/12/unthrottled-fixing-cpu-limits-in-the-cloud/
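For reference, a minimal sketch of what that looks like in a pod spec (the pod name and image are placeholders):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example
spec:
  containers:
    - name: app
      image: example/app:latest
      resources:
        requests:
          cpu: "1"
        limits:
          cpu: "1"   # a full core; fractional limits invite CFS throttling
```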
1
Try TimescaleDB, which is based on Postgres.
1
Trying to do if/else and other logic in a yml / json / xml file. Like a chump.
1
Trying to run before you can crawl. The only "guidance" for threads like this is: learn programming first. DevOps isn't something you just learn from scratch. It is
1
Trying to transition into a DevOps/Cloud role from an IT Support/SysAdmin role. Has anyone made this transition? What allowed you to make the transition stick? I have some familiarity with Python, git, AWS, Azure & Ansible and I'm also learning Terraform currently with Terraform Up and Running. I think I'm struggling mentally because I feel like there is still a lot I need to learn before I can be competent enough to work on something in production environments.
1
Twice I have fucked up git to the point all the developers lost days of work because we had to restore from backup. The worst part of this is that I don't even know what happened, and to this day, a "merge conflict" is like playing with a loaded gun. Most of the time I can fix it, twice it destroyed the master branch and rendered it unusable.
1
Typescript with CDK
1
Typescript, I use Pulumi and it works wonderfully.
1
Typically in a junior position, you would maintain the land instead of setting it all up. I think you're on the right track and if you manage to get done what you are working on, you could easily promote yourself.
1
u/CyberStagist Below are the DevOps Areas I actively work on and try to keep updated with. I hope this helps you understand why I am careful not to flood this with the strategy I follow to keep up with my knowledge areas.
* Continuous Planning
    * Prioritization
        * various capabilities within this
    * Tracking
        * Issue and work tracking
* Continuous Integration
    * Version Control
    * Development Practices
    * Security
    * System Architecture
* Continuous Testing
    * Test Automation
        * Security Testing
        * Functional Testing
        * Non-Functional Testing
    * Continuous integration of automated tests
    * Test Environment and Data Management
    * Various metrics, defects, etc. management
* Continuous Monitoring and Feedback
    * Telemetry
    * Different types of monitoring and their meaningful usage
    * Incident Response Management
* Continuous Deployment and Release Management
    * Self Service Portals
    * Release Strategies
        * Cloud infrastructure provisioning, application releases, etc.
        * Database releases
    * Configuration Management
    * Automated deployments, etc.
1
u/kiwi833 Learn everything they teach you. Once you are done with the boot camp, identify the one area that you liked the most. Then focus on it and do more courses to increase your knowledge in that area.
1
u/snake_py if you are moving from GitLab, I suggest you do a POC on Jenkins X for the CI/CD part.
I can say from personal experience that it is very similar to GitLab CI, except for GitLab's built-in Git repository and some basic configuration changes.
But the key is to do a POC and confirm it can fulfill all your needs. Current and planned.
1
Udemy, Udacity, edX, acloudguru, KodeKloud, YouTube, cloud.google.com, AWS docs, literally any computer training company that's been updated in the last five years.
1
Uh but I'm going to need to get promoted next year so let's start that redesign now thanks.
1
Uh, well, I once black-hole'd all root zone DNS queries for an entire country for ~10 minutes or so.. Probably that one.
1
Ultimately you're going to need to at least be able to write scripts properly in this career; otherwise you'll get stuck over and over again. Once you have syntax and semantics down for a language, it's all about following common patterns to do what you need: likely making API calls, iterating over the results, and making more API calls based on those results. If that is too much, I'd pursue a different career.
If you want to get into SRE land you'll need to be able to read and understand application code and traces.
1
Understand completely. The post is part of my newsletter and the title is appropriate in that context imo but less so as a post on Reddit. However, the rules on this sub state that the title should be the actual title. I'll have to do a better job of coming up with titles that are fitting both for reddit and the newsletter.
1
Unfortunately he just bought himself beats!
1
Updated a system settings file which caused all scripts to fail (typo in the file), which caused about 1200 displays in vending machines to turn bright white and not show anything anymore. Unfortunately, they also didn’t connect to the server anymore, so it took 10 people almost a month to go around the Benelux and plug in a USB stick with a fix for the typo…
It was 2009, so very early narrowcasting era, machines were connected via 3G.
1
Upgraded an ECK operator in prod without properly reading the change log (in my defense, it was a minor version number).
During the upgrade, it deleted all running ES clusters. My PVs were not set to retain and everything was gone :(
1
Use your words.
1
Using Bitbucket with Jira would make sense because they are both maintained by Atlassian. However, GitHub also integrates with great project management tools like Notion and Asana.
If you don't have any specific requirements, then the Bitbucket/Jira combination is fine. You can also consult the software firm that you're hiring about this. Maybe they have already set up their entire workflows with GitHub or GitLab.
1
Vagrant is great, and you don't need super powerful hardware. If you look at the [repo](https://github.com/ChadDa3mon/infra-ansible) I put in another comment, you'll see you can use VirtualBox on your local laptop. That project basically spins up 6 linux VMs on your local machine, and then uses Ansible to configure them all for their various roles. It was a super fun project to work on, I basically fixed/updated what someone else had done back in 2019 and learned a TON.
Look at the [Vagrantfile](https://github.com/ChadDa3mon/infra-ansible/blob/master/Vagrantfile) to get an idea of how it works, then look at some of the [Ansible Playbooks](https://github.com/ChadDa3mon/infra-ansible/tree/master/artefacts/playbooks) to get an idea of how things work together.
As for getting hands on some hardware, you really don't need much. If you want something to get started on, look into something like a HP Prodesk 600 Mini from ebay ([Link to direct search](https://www.ebay.com/sch/i.html?_from=R40&_trksid=p2380057.m570.l2632&_nkw=hp+prodesk+600+g4+mini&_sacat=171957)) - they are powerful little boxes for $400 or less. This [one here](https://www.ebay.com/itm/125259786325?hash=item1d2a10a855:g:iHcAAOSwZWJgFHDi&amdata=enc%3AAQAHAAAA4MuIUX%2BmPFEcGE%2F%2FAvLxqPZMFrexhmLyLwAXzf34yGA%2BynN3isp4HjDd40yTBJ3sLwUcRM0IUd%2F5jpmAXNqgG71iBNrzXmRewDo1OubORGxLH6nci6ACV9OKjJD3tcx%2BbCWcforeFvEPxFrEklYDSrMK%2FcRGGB7jk7uSIdBQrPwRkX06jX6Y40ZFSz4a4sfAP%2FNAVUe%2FPb0W7ns0K%2FToWQbOaUbe0rIwbDT5fe8D4uudZLvGK5QLj2xvvnUkPTP4eCrICk2oFN7FTppnJ%2BU9hwL2q11buBLGlItAEy5GHlDs%7Ctkp%3ABFBM9rq-5rtg) is $329 right now and will work great for your needs. You can put proxmox on that and have everything you need to get started. My Ansible/Vagrant project relies on VirtualBox, but it could easily be modified to create the VMs in proxmox.
As for git, I mean just start using it for version control so you're comfortable with things and can easily revert back if you break something. Git is at the core of most "X as code" stuff. You don't need to use Github, Gitlab, Gittea etc, you can simply run git on your local laptop. BUT, having your own git server (github, gitlab etc) will make backing all of this up, AND sharing it with others, possible. Eventually you'll want to get to a place where your systems are pulling code from a known good (Master/Main) branch in a github/gitlab repo. The [example repo](https://github.com/ChadDa3mon/infra-ansible) I mentioned will help show you that as well :)
Also, check out [this article](https://medium.com/design-and-tech-co/end-to-end-automated-environment-with-vagrant-ansible-docker-jenkins-and-gitlab-32bb91fbee40), written by the guy who wrote the original project back in 2019, he did a great job of explaining things.
Lastly, you said:
> I'm using git for some tools and scripts I wrote to better manage some things, but those tools and scripts require a config file on the local machine. And those config files are not backed up or under any version control.
This is exactly where Ansible can come in handy. Get those scripts in a git repo (github/gitlab), then have Ansible push them out to all of your boxes (assuming they are the same script) or have Ansible push out the appropriate script to the appropriate server(s). Ansible has a git module so it can pull in files from a repo, then push them out.
Honestly, if I were you, I would start out as someone else said:
* Create 2 or 3 linux VMs (Ubuntu, Redhat, CentOS - doesn't matter, whatever you're comfortable with).
* Do THE BARE MINIMUM you need to get these servers to a point where you can SSH into them. Don't update packages, don't install things.
* Write an Ansible playbook that will perform the basic tasks you'd normally complete by hand. For example:
    * Setup SNMP
    * Update packages (apt-get update, apt-get upgrade)
    * Install packages you want to have (htop, tcpdump, php, whatever)
    * Upload some fake 'scripts' to them
* Avoid the temptation to just run 'shell' commands with Ansible; force yourself to use their modules. If you're like me, you think in terms of shell scripts, and you need to break that habit with Ansible.
    * Shell scripts say "Do step 1, then step 2, then step 3. If those steps work, you should arrive at your destination."
    * Ansible playbooks say "This is the destination I want, you figure out how to get there". You'll know you've succeeded when you can run an ansible playbook over and over and nothing changes/breaks/errors out. That's what we refer to as [Idempotency](https://docs.ansible.com/ansible/latest/reference_appendices/glossary.html#term-Idempotency) and it's an important concept to grasp.
* Blow away the VMs you just created, and do it all over again. Smile as you now don't have to fear losing a system any longer as you know rebuilding it takes a matter of minutes.
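A rough sketch of what such a module-based (idempotent) playbook might look like, with hypothetical package names and script paths:

```yaml
# Hypothetical baseline playbook: modules instead of shell commands
- name: Baseline configuration
  hosts: all
  become: true
  tasks:
    - name: Update apt cache and upgrade packages
      ansible.builtin.apt:
        update_cache: true
        upgrade: dist

    - name: Install common tooling
      ansible.builtin.package:
        name: [htop, tcpdump]
        state: present

    - name: Push out a script from the repo
      ansible.builtin.copy:
        src: scripts/cleanup.sh          # hypothetical path in your repo
        dest: /usr/local/bin/cleanup.sh
        mode: "0755"
```

Running this twice should report no changes the second time; that is the idempotency the glossary link describes.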
1
Vanilla CUE in most cases at the moment. Dagger is just an app that uses CUE for itself, it won’t help with other configuration.
1
Very true. We need more of each of these to lean more towards the other.
1
VictoriaMetrics is inspired by ClickHouse but specifically designed for timeseries, might be a good option.
1
VictoriaMetrics maybe?
1
VictoriaMetrics should easily handle such a workload (100M series at 115K samples/sec ingestion rate, 10 queries per second) even in a single-node setup. See, for example, [this case study](https://docs.victoriametrics.com/CaseStudies.html#wixcom). If single node isn't enough, then [cluster version of VictoriaMetrics](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html) comes to rescue.
1
Violation of security policies is no joke.
1
Vs code dev containers.
1
WAF will prevent a ton of billing in the event of bogus requests from the WWW, I'm sure. Probably worth the $4-5/mo.
1
Warp10 is the best if you want a real TSDB
1
Was it the split brain problem?
1
Was originally intended to be a culture that embraced shared responsibility and enabled velocity through experimentation, systems thinking and constant feedback. https://itrevolution.com/the-three-ways-principles-underpinning-devops/
Then there was SRE as an implementation of DevOps. A role for a person that bridged the gap between dev and Ops, so to speak. There’s a lot to unpack here but save it for later in the discussion.
Suffice to say that what DevOps is to many today, especially as it relates to a role by that title, is really the automation part of SRE on its own. I hate it. I am also seeing specialist SRE teams take on the title of “Observability Engineers” and it seems to me we’ve created more silos rather than breaking down barriers and sharing responsibility.
It is natural, but to me DevOps is still what Gene Kim was talking about. Maybe because I’m old, but also because it made sense. The new stuff just seems like people hopping on a trend to get new job titles and trump themselves up.
1
Was that sarcastic? Didn't get you, sorry.
If it was not, i don't understand how to do anything that you said. How do you learn it? Is there a roadmap?
Op just said that he did automate projects in bash at home ... Now that is a usecase that i think could be interesting to master by yourself.
1
Way too many abstractions than necessary IMO.
1
We actually lost all three master nodes at the same time, and this being ES 1.7, the new masters didn’t read the state back from the cluster but decided that the new truth was nothing. This has been fixed in later versions.
1
We all screw up and sometimes something happens where you can't inform work about it ahead of time or even at the time. As long as it's rare it shouldn't be a problem.
The important thing is to speak to your manager about it promptly, it should be the same day really. If I was your manager and you randomly didn't show up one day I'd ask why the next time you did, not because I'm mad but because I need to understand what's going on so I can accommodate any needs and make sure you're on track with the stuff that matters (the deliverables).
"Personal reasons" may not be enough of a reason unfortunately. Assuming you work full time you spend a lot of time at your job and it's important that your line manager understands your needs so they can support you. Their main concern is "is this going to continue to affect you in some way?"
You should be able to let them know what's going on if they're a decent and reasonable person. Something as simple as:
> I'm really sorry about yesterday, unfortunately I've been having some sleep problems recently. For some reason I was really struggling to get to sleep, ended up falling asleep really late and completely slept through my alarm and into the day. I've spoken to my doctor about it recently and they've given me some advice but it's not impacted my work before so I've made an appointment to see them again and see if I can get some more help from them.
It's honest, lets them know there's an on-going issue, but reassures them you're doing something about it. It also primes them to be on the look out should something happen again in the future, although hopefully it won't.
I'd also go back to your doctor. Often on these things the first thing they'll do is try to give advice and make sure you're not doing silly things like playing video games until 3am each night.
With things like this it can be useful to keep a sleep diary (time you went to bed, time you think you roughly fell asleep, time you woke up, whether you felt rested after you'd been up for 30mins). This gives your doctor something to go on and acts as some level of data and evidence to show it's not just poor habit but there's actually a continual issue.
The main thing is to be open, honest, and try to ensure it doesn't happen again (although this may be a longer process). Keep your line manager in the loop and they'll appreciate it should something like this happen again.
Also consider a secondary alarm that forces you to get up to turn it off or something!
1
We are doing this by using "EKS security group for pods" mode applied to those pods which don't allow L4 egress by security group rule. Then istio/envoy can be used and the things you need are virtualservice, destinationrule, gateway, etc as well as setting up an egress gateway (on a separate nodegroup with a taint).
1
We are very heavy users of Jenkins; oh god I hate that Java piece of crap. We also have a bunch of CI/CD projects in GitLab which I kind of like. Not sure what the issue is with limits, I just haven't noticed the message yet, probably...
I'm experimenting with Argo Workflows, but it's like a bunch of separate tools to integrate with git, plus you need a Kubernetes cluster... and maybe it's just me, but customers just don't want to pay for a cluster... and somehow expect the same functionality from two standalone servers... maybe we end up running it in minikube or something on a single node... what a waste of potential. Maybe Jenkins X, but I'm not sure about that one.
1
We call it ClickOps
1
We can say, he beat you to that!
1
We create the zones and generic records as part of our hub creation, records are created along with their resources.
We’re doing almost everything wrong though so maybe don’t follow us. I’d prefer the zones independent of the hub code myself.
1
We definitely give some weight to the ease of onboarding/hiring if we went with the "standard approach" vs rolling our own solution in some spaces. I'm not sure it's actually been the deciding factor in something we've adopted so it must not be a lot of weight though 😂
1
We do something similar with haproxy and ansible to generate the config. Not pretty, but pretty resilient.
1
We had a similar setup (back in the bad ole days of awful Helm 2, when the Helm provider was even worse). We moved to a pattern of just provisioning the infrastructure and cluster with Terraform. Then once the cluster is up we use Ansible to install all of the stuff needed until ArgoCD can take over. Has worked out really well for us, your mileage may vary.
1
We have one project with many repositories. Works well but I'm not sure about standards. Maybe if you're big enough you start playing with multiple workspaces for some reason...
1
We have pretty long front-end end-to-end tests. Like one deployment cycle can take up to 3h of run time. Hence why I need to run in parallel a lot.
1
We implemented it for some clients (turn off, not destroy & rebuild), but because of how little they were off (roughly 8 hours per day) the savings just weren't really there. We realized better savings by right-sizing instances and purchasing reservations, which is where I recommend starting if you haven't already.
1
We need more bullshit definitions.
1
We ran into this recently. You need streaming metrics using Kinesis to reduce the latency from AWS -> DD, or else you'll see 10+ minutes minimum.
1
We use a combination of Github Actions and CircleCI. We liked it for our needs. In the past, I had to manage Jenkins instances and hated it.
1
We use Azure Devops for both pipelines and repos. We have a few internal "runner" nodes to handle jobs on the internal network. Pretty solid and we're happy with it over all.
1
We use Jenkins and have our own custom gradle build pipelines for multi-hour builds. They are terrible: since 90% of everything gets done by one gradle function, nobody except the original author knows what it actually does.
1
We use Jenkins but are migrating to GitHub Actions. So GitHub Actions >> Jenkins.
1
We use redgate flyway for versioned schema migrations and repeatable migrations.
https://flywaydb.org/documentation/
The workflow in general is that Flyway migrations are created by all (DBAs, DBEs and software devs) and checked into the software's git repo. The software is built and deployed to environments via a Jenkins pipeline which executes Flyway on the respective database. These software builds are promoted to other environments (dev -> qa -> prod etc) and the Flyway migration scripts are executed on each env based on what hasn't already been run there; ran scripts are tracked via Flyway's schema version history table. Gates exist so that only prod scripts can be run by devops engineers during prod deploys. Devops engineers run a Flyway dry run to have DBAs review before deployment to prod, and DBAs only review the Flyway scripts that the dry run has called out as runnable on the prod env.
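For context, Flyway discovers migrations by filename convention: `V<version>__<description>.sql` for versioned scripts (run once, tracked in the history table) and `R__<description>.sql` for repeatables (re-run whenever their checksum changes). A hypothetical pair, with made-up table names:

```sql
-- V1__create_users_table.sql  (versioned: applied once per environment)
CREATE TABLE users (
    id         BIGINT PRIMARY KEY,
    email      VARCHAR(255) NOT NULL UNIQUE,
    created_at TIMESTAMP NOT NULL
);

-- R__refresh_reporting_view.sql  (repeatable: re-applied when edited)
CREATE OR REPLACE VIEW user_emails AS
    SELECT id, email FROM users;
```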
1
We use, in no particular order: Jenkins, Harness, TravisCI, and GitHub Actions.
We're working on migrating off TravisCI as GitHub Actions can be used without managing credentials for GitHub or AWS through the built-in GitHub keys and with OIDC for AWS. Also, TravisCI hasn't scaled well for us and we don't yet exceed our included credits for GitHub Actions.
We're also, when possible, moving jobs from Jenkins to Harness, the biggest obstacle being very old Oracle builds that rely on a ton of state that works and exists on one of the Jenkins servers. This migration actually adds to our costs, and I personally would prefer to migrate to new Jenkins instances hosted in Kubernetes and running every job from Jenkinsfiles, but I don't want to have that argument with the dev teams anymore.
1
We used influxdb for this scale and it worked just fine (at least for our use case, which was metrics).
1
We worked in shifts, and got very well compensated in the end. First 24h a lot of the team just worked into the night, and by the next morning we had a clearer picture of what was left to do. We were up and serving the most important data about 12 hours in I think.
1
We would just have tags with specific startup/shutdown times and lambda ran through and shut down/started up instances. Super successful in saving money.
Also created havoc and found edge cases/race conditions with our product so helped increase the resiliency of the overall system.
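The tag-driven decision can be sketched in plain Python (the tag names and data shapes here are made up; the real Lambda would read tags from the EC2 API via boto3 and call stop_instances/start_instances accordingly):

```python
# Hypothetical sketch of the start/stop scheduling decision.
# Instances carry made-up "startup_hour"/"shutdown_hour" tags (0-23).

def plan_actions(instances, hour):
    """Return (ids_to_stop, ids_to_start) for the given hour of day.

    Each instance is a dict like:
      {"id": "i-123", "state": "running",
       "tags": {"startup_hour": "7", "shutdown_hour": "19"}}
    """
    to_stop, to_start = [], []
    for inst in instances:
        tags = inst.get("tags", {})
        if "startup_hour" not in tags or "shutdown_hour" not in tags:
            continue  # untagged instances are left alone
        start_h = int(tags["startup_hour"])
        stop_h = int(tags["shutdown_hour"])
        # handle overnight windows too (e.g. start 19, stop 7)
        if start_h < stop_h:
            should_run = start_h <= hour < stop_h
        else:
            should_run = hour >= start_h or hour < stop_h
        if should_run and inst["state"] == "stopped":
            to_start.append(inst["id"])
        elif not should_run and inst["state"] == "running":
            to_stop.append(inst["id"])
    return to_stop, to_start
```

A Lambda on an hourly schedule would just feed the current hour and the tagged fleet into a function like this.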
1
We'll probably have to at some point. This is a newish problem due to rapid expansion along with a few acquisitions. We also have a sprawl problem because people keep creating new AWS accounts, when we really need to create shared sandbox accounts to consolidate things.
1
We're migrating from Jenkins to Bitbucket because the maintenance and security overhead of Jenkins is just not tolerable. We've got only simple apps to deploy to AWS Fargate and some EC2 instances.
I quite like BBPL but I've also never really needed to deploy anything particularly complicated.
I use GitHub actions in some personal projects and like it just a little more than BBPL purely because I can define multiple workflows in different files rather than being stuck with 1 mega file for multiple workflows/steps.
If all you're doing is running some basic checks and deploying an app, pick whichever is easiest to migrate to.
1
We're using Kops to run Kubernetes in AWS. Our CI/CD system runs an hourly garbage collector that shuts down sites according to a TTL set during build. At the end of the work day most staging sites are shut down, so all Kubernetes spot worker nodes get terminated, leaving only the masters up.
Next step is to automate shutdown/startup of the masters, but then it takes about 10 minutes for Kube to sort itself out the next day.
1
We’re not happy with it but we also don’t hate it, per se.
For us, BB pipelines alone couldn’t get the job done without a crazy amount of hurdles. Also, they’re kind of slow. Or really slow. However, we’ve found ourselves a little spot where they make our lives easier.
We build our backend APIs in Kotlin and deploy as docker containers. Front end is Vue SPAs. Our BB pipelines are set to build for all feature and bugfix branches and will run all of the compile, lint, and unit tests. So when we go to review a PR it will automatically be linked to the latest pipeline build and we’ll know it’s technically sound and can focus on any code/design/fitness review tasks.
We do all the post merge and integration checks on Jenkins as that is what runs our real CI/CD pipelines and has our promotion workflows.
It’s not perfect, but it works pretty well for our small team and we’ve saved quite a bit of effort by not having as many post merge defects to contend with.
1
We’re using Bitbucket now and if we could justify moving, we definitely would. It has incidents weekly, the UI and git commands can be extremely slow at times without cause, which is extremely frustrating, and its display of merge conflicts is far worse than Hub or Lab.
Welcome to the DevOps industry, bro! And good luck to you!
1
welcome to RDD -- Resume Driven Development
1
Welcome to the correct region of the supply demand curve. There is a global shortage of people with that combination of IT skills and problem solving skills right now.
Those of us who were working in tech during the dot com crash will remember seeing decent graduate engineers flipping burgers and tending bar because there were no jobs available for them.
Enjoy every minute of the upside. 🙂
1
Well as long as your CI/CD supports shell execution it should be no problem.
We run some vagrant commands with virtualbox hypervisor in our bamboo instance just fine.
1
Well by default the `-f` parameter (forks) is set to 5, so even though there's no small default limit, the blast-radius will be minimized by slower execution due to only running on 5 hosts at a time.
But I circumvented that because I've gotten used to always running with a high number of forks (-f 40), because I'm impatient.
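For reference, the default can also be raised permanently in `ansible.cfg` instead of passing the flag each time (a sketch; 40 matches the flag above):

```ini
# ansible.cfg: raise the default fork count from 5,
# equivalent to passing -f 40 on every run
[defaults]
forks = 40
```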
1
Well done
1
Well done and Congratulations :-)
1
Well for automation reasons I think linting and code formatting could be done by the pipeline instead of git hooks. This would involve committing to the branch.
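As a sketch of that idea in a GitLab CI pipeline (the image and the choice of ruff are illustrative assumptions for a Python codebase):

```yaml
# Hypothetical .gitlab-ci.yml fragment: lint/format checks as a pipeline
# stage instead of local git hooks, so every branch gets the same gate.
stages:
  - lint

lint:
  stage: lint
  image: python:3.12-slim         # placeholder toolchain image
  script:
    - pip install ruff
    - ruff check .                # fail the pipeline on lint errors
    - ruff format --check .       # fail if formatting has drifted
```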
1
Well, I just got the S3 tutorial up and running in about 10ish minutes. I've never worked with Pulumi but I have worked with Terraform. I really like that I can just choose whatever language I am comfortable with and get things provisioned.
1
Well, this sounds like good news. I don't mind improving this skill a bit, but I wouldn't feel comfortable if it was more than 10% of my work. That's why I want to find alternate tasks/projects that I can do in this company, or ask my manager to involve me in such tasks and offer me further education.
I don't want to be a developer, nor to get much involved in coding...
1
Well, u/OP is migrating from GitLab, where you use DinD to set up your build environment.
If u/OP continues using the existing DinD from GitLab, he should not have much of a problem self-hosting Jenkins.
The key to GitLab's success was the DinD use rather than anything magical, so I would strongly recommend a POC with Jenkins before he rejects or approves it.
1
Well with `crane` it condenses this to a single step:
```
crane copy SRC DST
```
`skopeo` does the same, but I found `crane` easier to get authentication working with.
1
Well written. These are the conclusions of our organization as we said we were agile for a long time before finally understanding what you wrote and actually turning agile.
1
Well, if you don't like dev work, then that's definitely a sign that it may not be worth pursuing. That said, especially with a fairly small sample size, you may actually be conflating one person/company's approach with the field as a whole. With a bit of experience under your belt, dev jobs are almost as good as it gets in terms of stress level and wlb.
I'll push back a bit on one premise: that sysadmin type roles don't have the need to figure out existing systems while simultaneously changing them. This is a huge part of sysadmin work, and imo code actually makes it easier to do this. Code defined systems are much easier to maintain than a gui based tool, and the gui is simply translating your clicks into code. This is also the direction the industry is headed.
Lastly... Consider the pay increase you get with more coding skills. A few years of skill development translates to a massive increase in lifetime income. This allows you to travel, retire early, and still have enough to live comfortably and help others. Is that worth a couple years of skill development?
1
Well, nobody implied that the technical side is only `kubectl rollout`... The issue I see is when you have an architect that doesn't understand the tech or how to use it.
Architect should be an endgame goal for technical people, not a starting career.
1
Well, programming is a course in every Computer Science degree... So should be the basic setup of infrastructure. And there is more to it... Like compiler study for languages, there can be IT knowledge taught directly. In an abstract way, you get a whiff, but there are ways to go deeper.
1
Well, some mistakes linger. I still remember drop table on a critical table in production. About the same time ago.
1
Well, what attracts me with Kubernetes is the concept of the master and nodes. You just tell the master to deploy an image and you don't need to care about how, what, why, or even where it is. It will just do it. Perhaps that is what sucked me in because it sounds like an amazing position to be in compared to where things are today.
But as I said I don't do black magic. If I end up using Kubernetes I need to know what it's actually doing so I can leave it to do its thing. I would feel far worse deploying something in Kubernetes that works but that I don't personally understand why or how, even if that's part of what it does. I am just too responsible to do that. So before I use it I will learn it, if I feel I need to and I get to that point.
What FOSS/free open source tools do you recommend for logs? ELK, or is there something far simpler to do the same thing? Prefer something that can be self-hosted rather than cloud focused.
Thank you for your advice.
1
What about GitLab is worse than GitHub, in your opinion?
1
What about YOU? OP is asking what YOU use.
1
Ask your cuz what all he learned at the bootcamp: which tools and practices did he gain proficiency in?
1
What are people's thoughts about Argo Workflows?
1
What are the problems with your current approach? It's difficult to recommend anything without knowing why you want to change the process
1
What are you having issues with? I feel like Rancher actually makes Kubernetes pretty accessible.
1
What are you on about, I literally dropped my phone as I read your comment. Can you cite your source?
This is how every company I have worked with did it, migrations to an environment in the deployment pipeline, app deployment afterwards. And it is not manual, because of course the deployment pipeline is automated.
1
What are you talking about? I believe that your problem lies not in the environment, but with the terminal emulator.
I have a feeling that you are using Windows (as I've yet to see a 'git bash' anywhere else). Use 'Windows terminal' software; and as for environment? Consider switching to wsl2 and don't look back
1
What Azure certifications do you recommend? Do you have any? What was your study sources? Thank you for your help!
1
What do the developers think? In my opinion, teams should eventually own their pipelines/builds and not just blindly throw code and let you deal with it. So them having a good grasp of how things work and which platforms have a good UX is important.
I would strongly suggest that unless cost are prohibitive, that you should not self-host the platform but maybe have your own runners for better performance / specialized agents. You should not just compare costs for the product, but how much running the platform will cost you in man-hours and how hard it is to use by the average developer.
In addition to that: is your company's competitive advantage "hosting a CI/CD platform" and running tons of builds without spiraling costs?
As said in other comments, Bitbucket and Atlassian tools are pure garbage. I don't think that anything has changed in the last 12-18 months, so I strongly suggest that you avoid them too. Same goes for Jenkins: its plugin model is really outdated and automating its config / deployment hurts.
1
What do you mean by "additional ways to shoot yourself in the foot"? I've seen the term footgun being used on here, just curious. TF all the way for me though.
1
What don't you like about Bitbucket?
1
What exactly are you referring to?
My best guess is that you are creating PEM certificates but Keycloak requires a Java Keystore/Truststore.
If the conversion from pem to jks is what you want to avoid, the official jboss/keycloak image does this automatically if you mount the certificates/key to a certain path.
1
What I like to do is look at the current jobs posted for my role or a step up I.e. if I'm junior I'll look at mid level to senior roles.
Look at the tech stacks they're working with and plan my self education around them.
The method for learning the actual thing is going to be different for everyone so I won't get in to that.
1
What is documentation?
1
What is the appeal of Crossplane over tf/pulumi? Isn't Kubernetes a prerequisite?
1
What kind of attacks? I assume you’re talking about passing through TLS connections to your origin from some load balancer layer or similar. I do not expect any significant security issues, assuming your origin servers are correctly configured to serve TLS traffic and such, just more bothersome to deal with certificates and keys at the origin.
There is no way to pass through the client certs directly if you terminate TLS at your load balancer layer, though you could configure the LBs to attach the client cert or some identifier in the request headers or similar, and have the origin servers blindly trust the attached cert. Obviously, you need to secure the communications between the LBs and your origin, but you’re probably doing that anyway.
1
What kind of experience do you have as a sysadmin? I'm thinking about switching too.
1
What kind of further skills/training can I ask my manager/company for on this?
I mean, I know Docker, basic Kubernetes, and some Linux, but what should I focus on in a practical way to study or get coaching on from someone senior in the company?
1
What kinds of bugs?
1
What OS do you have on the desktop? Ubuntu server? Or how do you setup that homelab? I’m really interested because I am in the process of building one.
1
What other things can Infrastructure be? Infrastructure as a directory service? Infrastructure as DNS?
1
What the hell is AWS doing? I don't want to start a war, but at the time of writing AKS (on Azure) offers 1.22.6, 1.23.3, 1.23.5 and a preview version of 1.24 :/
1
What they mean by “more foot guns” is that someone might come along and write code they can’t decipher.
Terraform essentially has a knowledge ceiling.
1
What will you use to invoke the Lambda? You'd probably need to add extra bits like Kinesis/SQS to get the Lambda to invoke when CF+S3 receive a request, as AFAIK there are no CF/S3 Lambda event sources.
Instead, consider putting Lambda in your request path as your JS app. CF will still serve the HTML+CSS cached from S3, but you can "deploy" with Lambda now, and use the DynamoDB SDK from there.
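If you do put a Lambda in the request path, the handler itself can stay tiny. Here's a hedged sketch: the event shape assumes an API Gateway-style proxy event, all names are illustrative, and a real version would call the DynamoDB SDK (boto3) where noted:

```python
import json

def handler(event, context=None):
    # Hypothetical request handler: reads a query parameter and returns a
    # JSON response in the shape an API Gateway proxy integration expects.
    # In a real app, this is where you'd use the DynamoDB SDK (boto3).
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "world")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"hello {name}"}),
    }
```

The JS app served from S3/CloudFront then calls this endpoint instead of talking to DynamoDB directly from the browser.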
1
What you're experiencing is normal - kubernetes is at a very low level of abstraction relative to logic apps, function apps and app services.
The services you're used to are, ultimately, just containers in a cluster, exactly like what you have running in kubernetes now. Kubernetes clusters usually represent a transitional state for people moving from on-premise, self-hosted infrastructure to the cloud.
It's quite likely that what you've run up against is the reason why they're still on kubernetes, and possibly the reason why their last SRE couldn't get out of there fast enough: it is a bloated and complex app that's hard to work with.
You can immediately dispel the fear of being fired by getting ahead of it and telling your manager that you believe this to be the situation - be honest with him that in order to be effective in this role, you are going to have to take a long time and extensively hit the kubernetes documentation, and that the immediate-term plan for getting out of this situation is to look into migrating the app to the higher-abstraction, easier-to-manage services you're used to.
If they fire you for that - c'est la vie. I think it's far more likely that they're absolutely up shit creek and would actually offer you more money to stay and learn the thing if you threatened to leave.
1
What's IaD? Instead of using terraform, use yaml files?
1
What's super neat about Reddit is the ability to upvote and downvote as a community what they agree with / disagree with. Crazy how you'd think "maybe I'm in the minority here" - you ever stop to wonder why maybe you're in the minority?
1
What's your take on things like DHCP and IP address management? I have an environment with a bunch of servers and switches and nothing else.
1
What's your ultimate goal? If you're trying to configure dns servers why not [configure a nameserver in the network config](https://cloudinit.readthedocs.io/en/latest/topics/network-config-format-v2.html#network-config-v2)?
On Ubuntu this config should use netplan to provide the expected config to NetworkManager or systemd-networkd (depending on which is provided in your image).
Manually touching `resolv.conf` isn't something you likely want/need to do on Ubuntu 20.04
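As a rough sketch (interface name, addresses, and search domain are placeholders), a network-config v2 snippet with explicit nameservers looks something like:

```yaml
network:
  version: 2
  ethernets:
    eth0:                 # match your actual interface name
      dhcp4: true
      nameservers:
        addresses: [10.0.0.53, 10.0.0.54]
        search: [example.internal]
```

cloud-init hands this to netplan, which renders it for whichever backend the image uses, so `resolv.conf` gets managed for you.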
1
what’s CUE?
1
What’s wrong with vault?
1
Whatever is needed.
1
Whatever you can use to automate your task. Preferably something that’s popular with good community support.
1
When I see text above a screenshot and then a screenshot below it that has nothing to do with it I just get massively confused and never think to look at the text below it. Idk, maybe it's just me.
1
When I started, our system used to run on a shared hosting environment where we couldn't load our own cpan packages for Perl, we had to spool them in our local directories. In order to make it work we had to have a boilerplate module in each directory that added this special location as one of the library paths. Basically every directory had one of these.
About 2 weeks in I pushed a project up to production but accidentally pushed it to the root directory. Wouldn't have been a big deal except I clobbered the module I mentioned above with one from my project pointing to the wrong directory in relation to the root, and thus broke the whole system. My boss ended up fixing it and had a good laugh. We still bring it up to this day to the new guys when they get worried they're gonna break stuff.
Oh and I fixed that setup many years ago - that kind of thing isn't possible anymore.
1
When I used to do it: Golang, bash, and YAML.
1
When my team deployed a pilot 2 years ago we found the biggest issue being anything that deals with encryption / security. Like WAF being a service but not tied to ELBs. Cloudfront required the use of old style certs via IAM instead of ACM and so on.
Also, some default limits are smaller than everywhere else.
If you’re checking to see if a service is supported, actually test the creation of the service and required option. We’d occasionally run into options that existed in the UI but wouldn’t work at creation or in practice.
Last thing: any code you have that references ARNs has to take into account the different partition.
We made all of our stuff work, often with a work around, but it wasn’t an afternoon stroll to get there.
1
When only the person who wrote it can understand it.....
1
When using a build environment like GitLab, it's nice to know that your environment is clean and isolated from other build processes by using containers. I'm relatively new to Terraform, but for Node, for example, it's super sweet to not have to manage multiple versions at once on the same host.
1
When we need to do this we ship a new command in our ‘toolbox’ binary that runs some code, this is reviewed via our normal PR process and runs in a container in prod
1
When you factor in headcount cycling and coordination issues with multiple team members, it really depends on what tech debt you want to incur.
Terraform tends to crystallize into generically brittle and somewhat mismatched declarations, with well-trodden workarounds to handle provider faults or to get flow control and batch management out of a language that doesn't really support them. These problems are pretty universal and ubiquitous.
Pulumi gives you the freedom to create new and innovative forms of tech debt bespoke to your organization prompting an unending desire to just refactor everything.
1
When you pass the CKA, go immediately for the CKAD; the CKA is like 70% of it.
1
Whew. Spicy take. Did you know that there's this thing called the open source community, where they collaborate on things related to software, application building, and so on? Or were you under the misguided notion that everyone, everywhere, at every level, just "figures it out on their own"?
1
Which CI/CD system are you using?
1
Which of these two things do you prefer?
The answer you gave was that you can't compare them without also considering this other thing, which is complete nonsense.
1
Which one did you enroll in?
1
which one? or both?
1
Which part are you confused by? My explanation may not be clear.
I believe the open source version allows you unlimited builds, no matter whether you're a corporation or a home user. The limitation is that you can't use runners with it. When compiled with the OSS flags, use with runners is disabled.
The Enterprise version allows you to use runners. If you're a home user I don't believe there are any limitations. If you're a corporation I believe you have a maximum number of builds per year before it stops working. Certainly long enough for evaluation purposes. The build limitation may apply to home users too. Not 100% sure on that.
If you're using Drone with runners and don't have a license, then you're using the Enterprise version in one of the two scenarios I laid out.
1
While I don't fully agree, I do think they should teach CI freshman year and expose them to build and test automation.
1
While this might be true, I have *very* serious doubts that many of those were because of *technical* growth.
Optimising for where you are while leaving *room* for growth is what you should be doing, not operating on the assumption that you're about to turn into the next Facebook or Google... after all, do remember that the solutions Facebook and Google use *literally didn't exist* when they grew and needed them. They scaled up with old technology without issue, then reworked things to be more efficient as they went.
That's not to say you can't learn from the big boys, but it's important to remember that how those guys do things *now* is *not* how they got where they are today. They got there with good business and marketing, then backed it up with tech.
1
Why are you asking reddit? The whole purpose of doing an exercise like that is to find that out. So write your few lines in Terraform and see what you have at that point.
Short answer is no, it's not that simple. Good news, you can find out yourself. Doing a project like this helps you understand what you don't understand.
1
Why do so much? Instead why not just share the ECR or whatever container repository you have and allow them to pull from it?
I never understood why we create complex solutions to easy problems.
1
Why do you need to script this when the shell already has the functionality? Could do with an example of what you're trying to accomplish
1
Why don't you just get the CI/CD pipeline to check for/delete these? It already has access to the servers, it's just another step. This is some very easy logic to write.
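For illustration, that cleanup step could be as small as this shell function (directory layout and retention count are assumptions, not anything from the original setup):

```shell
# prune_releases DIR KEEP: delete everything in DIR except the KEEP
# newest entries, ordered newest-first by modification time.
prune_releases() {
  dir=$1
  keep=$2
  ls -1t "$dir" | tail -n +"$((keep + 1))" | while read -r old; do
    rm -rf "${dir:?}/$old"
  done
}
```

Run it as a final pipeline stage, e.g. `prune_releases /srv/app/releases 5`.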
1
Why don't you just pay for GitLab? Sounds like the easiest solution and probably cheapest solution if you factor in the labor cost for transitioning.
1
Why don't you parameterize the script and add it to all branches? You can then run it conditionally based on the branch name.
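As a sketch of what that could look like in GitLab CI (job name, script path, and environment names are made up):

```yaml
deploy:
  script:
    - ./deploy.sh "$TARGET_ENV"
  rules:
    - if: '$CI_COMMIT_BRANCH == "main"'
      variables:
        TARGET_ENV: "production"
    - if: '$CI_COMMIT_BRANCH =~ /^release\//'
      variables:
        TARGET_ENV: "staging"
```

`rules:` with per-rule `variables:` lets one job definition serve every branch.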
1
Why exactly can't you use CDK? It's CloudFormation underneath.
1
Why Go?
I understand its purpose, but I never found a good reason to use it over Python. Our Go Lambdas don't run much faster than our Python ones; it doesn't really seem worth the extra engineering effort.
1
Why is it the wrong place or the wrong time?
1
Why not host gitlab on premise?
1
Why not install them?
1
Why not just use cloud resources for that bootstrap node?
1
Why not just use elasticache?
1
Why not use `requirements.txt`? Copy the file, then pip install from that instead.
`pipreqs` will also export version-pinned pip requirements for any project.
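If it helps, a common Dockerfile layout for this (base image, entrypoint, and file names are assumptions) copies `requirements.txt` first so the install layer stays cached between builds:

```dockerfile
FROM python:3.11-slim
WORKDIR /app

# Copy only the requirements first, so the pip layer is cached
# until requirements.txt actually changes.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Now copy the rest of the source.
COPY . .
CMD ["python", "main.py"]
```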
1
Why the downvote? At my current job we are using VM for a massive amount of timeseries and it works quite well (not perfectly of course, but nothing does)
1
Why would you generate something from your repo (which implies that this is done automatically) and then commit it back to your repo?
Rule #1 of git: the repo should contain source code only. Never commit build artifacts!
This is how you should do it instead:
- whatever data is needed to generate the index.html file should be in your repo
- the script that generates the index.html file should be in the repo
- your pipeline checks out the repo, runs the script to generate the index.html file and deploys it wherever it's needed
- done. No commit of the build result back to the repo. Why would you?
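Sketched as a generic CI job (the script name, bucket, and job name are made up), those steps boil down to:

```yaml
build-index:
  script:
    - ./generate_index.sh data/ > index.html   # script + data live in the repo
    - aws s3 cp index.html s3://example-bucket/index.html   # deploy the artifact
  # note: no `git commit` anywhere; the artifact never goes back into the repo
```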
1
Why would you say it's unusual? What else would IT support transition to?
1
Why wouldn’t I be able to use the same tag?
1
Why?
If you have EKS clusters running anyway, it seems like unneeded complexity to add a runner hosted on an EC2 instance that spawns ECS Fargate tasks and is limited to using only that one image.
It seems a lot easier to add a ci-runner namespace and deploy a runner there with the Kubernetes executor. (That's what we do for our teams: just add a runner to their non-prod cluster and call it a day.)
1
Why? Just curious the reasoning. Just bc you're using your cloud platform creds to access it which you also use to deploy etc?
1
Will add one today, thanks for pointing that out!
1
Will you really feel the difference in money? I'd stay if it were me, but I also don't care that much about money, so take what I say with a grain of salt.
1
Win + L has been around since Windows 2000
1
Wiped out all share drive content. No back ups. All gone.
Luckily I work with super old programmers who only used the share drive because they were told too and we only had it up 6 months. There was stuff in it but no one really cared. I got a new one up and now take proper backups.
1
Wish I could afford it. About exhausted all the free trials.
1
Wish I got the joke. Unfortunately I'm *quite* new to this kind of work.
1
With Atatus, you can setup alerts to automatically monitor your infrastructure for downtime and increase in CPU, memory, disk and storage consumption. Get notified through various channels, including Slack, Teams, Email, PagerDuty and more. Get real business value across your server landscape by monitoring the health and performance of your services, hosts, containers and resources.
1
With GitLab-like Docker build environments using DinD, parallel runs will not be an issue. You just have to make sure you have sufficient hardware to handle your max needs.
1
Woah dude. Where do you work?
1
woooo more blog spam!
1
Word
1
Worked at a shop that insisted on using an app the CEO half wrote then dumped on the team, can confirm.
Though, disagree that nobody cares about it looking like ass or loading slowly. That's only true for the mass market apps that put a big wall called "Sales and Support" between the users and the Devs. When the users are your coworkers... yeah they bitch endlessly.
Not to rag on the OP; I think his intention was more the infrastructure layout, not building the next ADP or w/e. But I do notice a lot of posts on social media where people predict the death of front-end/back-end/devops/sysadmins because <insert code generator> is letting them take their toy app full stack. Always makes me wonder, "so... where's your business logic?" The stuff that led to bugs or made the stupid page load slowly was never some CRUD shipping tuples to JS and back; it was always the wacky edge-case bullshit about Friday lunches being half time if you wore pink socks, or your mother's first boyfriend once ate tuna.
1
Workspaces aren't a Terragrunt construct. What Terragrunt does is help you template state configuration so that you keep your code DRY, something you can't do with plain Terraform.
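For example, a root `terragrunt.hcl` can template the backend for every module (bucket and region here are placeholders):

```hcl
remote_state {
  backend = "s3"
  generate = {
    path      = "backend.tf"
    if_exists = "overwrite"
  }
  config = {
    bucket = "example-tf-state"
    key    = "${path_relative_to_include()}/terraform.tfstate"
    region = "us-east-1"
  }
}
```

Each module then just includes the root config (e.g. `include { path = find_in_parent_folders() }`) and gets a state key derived from its own path.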
1
Worth checking if the company has a bug bounty program, as this might be in scope and you could earn some money by reporting it
1
Would be great if Ansible let you set a default limit and only expand it with a parameter...
1
would you be open to VC from time to time? maybe like 15mins call once a month? I'm a junior stepping into devops from swe background and would really appreciate getting career-specific advices from a senior. I will pay it forward in the future :)
1
Wow
1
Wow, you sound like me. Are you from the future?
1
WOW! Just wow......I have no words!
1
Wow! That's even better. Let me know if you need a job. We don't have anyone specializing in blockchain synergies currently.
1
Wow. Good for you, man. Also, is DevOps really that satisfying? I mean I've seen software developers do side projects and all "just for fun". I personally love making projects on my own that I'm passionate about.
Is this possible in DevOps? I've always thought of it as something necessary, but bland and uninteresting.
1
write it locally and ngrok that bad boy.
but yea seems about right...
1
Writing IaC in the same language as your application is a far better choice than static YAML or HCL. The amount of time I saved by using CDKs is mind-boggling, and the ease of integrating with your app code is unparalleled.
1
WSL looks like a good fit. It provides pretty good isolation and you can have graphical interfaces nowadays.
1
Wtf I full well played fifa on my company laptop throughout Covid..
1
Wtf is Infrastructure as Data
1
YAML for CI/CD pipelines and Kubernetes, Terraform, Python, currently learning Go for testing, and some C# for applications.
1
YAML is a data serialization language, stop mentioning it as a programming language. It makes my eye twitch.
And to answer the op’s question: bash and python. I’d like to learn go, at some point.
1
YAML is a programming language for data serialization.
Hope that will stop your eye twitch ;)
https://www.redhat.com/en/topics/automation/what-is-yaml
1
YAML itself really isn’t much though.
Terraform has its own DSL. One could argue being able to use more widely-known/common languages is more portable/easier for new starts to pick up.
I’m not really familiar with Pulumi though and TF is great. Just an idle thought.
1
Yaml, bash, Python. Yaml daily, but the other two sporadically
1
yea I would urge everyone to not freak out. just went through an SDE 2 loop and my questions were way easier than everything this guy had to do. everything I had was arrays/hashing/sliding window type problems, and one binary tree type problem
honestly he got absolutely shafted, especially for devops. but good on him for grinding and delivering under pressure, it’s impressive and he’ll be totally set for interviews he does in the future.
1
Yea sometimes it feels pretty pointless. I ended up implementing k8s for a company even after having emphasized that it was not the biggest bang for buck, and ended up leveraging the k8s experience for a 50% raise at another company a few months later. So it’s in our incentive, as professionals, to stay sharp on high scale technologies even if they’re inappropriate for the company we are working for. So even if, in rejecting the self-interest, you get overruled, you’ve done all you could. You don’t need to kick & scream; I just figure if you do your due diligence in trying to advocate, you’ve covered your bases.
But to your point, I think it’s the right move to opt-out. It’s not a fun & productive situation for anyone to be misaligned on overall priorities. Or rather put, if you don’t think the ship’s headed in the right direction it’s best to let it sail without you.
1
Yea you know it’s funny, I too just had a friend climb aboard Google and have been having similar thoughts. In pursuit of the (it’s a legit path of course) credibility & compensation, she had become so sunk into practicing highly abstract code problems that her practical technical skills & systems-thinking have lagged quite a bit. Personally I don’t think that’s the right path — best to stay abreast of the big picture. At least that’s what I find enjoyable and want to be doing.
1
Yea, I’ve had very little experience with Terraform but vast experience with other similar solutions like Ansible/Jenkins/GitHub actions and I’ll take Pulumi and / or cloud vendors’ SDKs every single time if it’s my choice. This shit becomes so complex and undebuggable so fucking quick. No thanks
1
Yea, my team hasn't moved to m1 macs for the same reason.
1
Yea, though I don’t know who isn’t using k8s nowadays. It provides a truly declarative way of instantiating infra and continuously reconciles that desired state. Think ArgoCD or Flux but for infrastructure.
1
Yeah — the guy trying to sell you some tool.
No wait, nevermind; they don’t use it either
1
Yeah $24K/year difference is not enough to commute to an office lol
1
Yeah absolutely - in essence it's just a reflection of the age-old problem that people who aren't engineers tend not to intuitively adopt an agile approach.
Engineers adapt a solution iteratively as things change, including corporate processes. I've seen this work incredibly well, and I've seen engineering-led companies of both startup and mega-corp size do this with their processes and tech stack to great effect.
But when a company isn't tech-led, they tend to believe you solve problems with lots of rigid rules, and when the rules fail they tend to view this as a political assault by the people saying the rules didn't work, rather than the rule actually failing.
It's not exactly a surprise that a lot of the latter companies fail hard.
1
Yeah, GitLab worked fine for my needs. But my employer wants to switch due to the rising costs of the SaaS solution.
1
Yeah I forgot about this too. We enforce separate user accounts, a 'database owner / create / modify' one and one with 'select / write' grants for the app. I don't understand in what world it is ok or hell, even recommended to give the app more permissions over the database than it should have to perform its actual duty.
1
Yeah I get confused when people say things like that.
It's popular, but it is in no way used everywhere. People seem to tunnel vision in on their jobs being an industry standard... it isn't really, it's just standard anywhere they're qualified to work.
1
Yeah I mean there are tons of case studies on this and big consulting firms that figure this sort of thing out and come up with plans etc. Like you said it comes down to the ROI often coming in when it is too late and even worse: even when it does come out it is mitigation not actual growth. Publicly traded companies are generally growth focused since their shareholders want that, they don't like money going to maintenance etc. especially on this scale. Erm yes those are the same shareholders that just dumped their shares at the first sign of trouble. It is really where governments should come in, but most western ones have been privatizing telecom etc.
That last bit on privatization is part of the issue. At some point the government needed infrastructure and they did not have the expertise/people to build it. So they sold certain rights to those companies to build it out, and those companies now hold those rights. So much of the infrastructure now privatized it is a matter of them consolidating over and over until you have a monopoly. Since it is infrastructure in a finite space, often governed by decades old government contracts, there isn't much a new company can do unless they pay the current monopolies money rights to do so. This is the situation in most western countries to different degrees.
1
Yeah idk if Ansible is the answer in our case, basically it's going to have to run the exact same script, plus or minus, as the provisioner already does + we don't use Ansible anywhere and adding it just for this would not be the best... Glad it worked out for you though!
1
Yeah, it is too much money to spend without the receiver's input. You'll get them a 60% board with reds and white keycaps when they actually prefer a 65% with blues and shine-through black keycaps.
1
Yeah k8s job is an option we're looking at. Should work / look a bit better.
1
Yeah more like challenges
1
yeah same we use code star connections and github webhooks, but we also use github actions to run ci tests with docker images in some cases .. otherwise we ci in code build or code deploy
1
Yeah, that's what I end up doing anyway. Books and papers aren't all bad, but stuff like "for dummies" or "bible" titles tend to cover things very broadly and potentially not to the depth you need.
That's just me though; I find that people learn/absorb/retain information in different ways, so reading this type of stuff can be great for some. I know quite a few people who would rely on bible-type tech books as their reference rather than using Google. Lots of paper bookmarks for navigation :)
1
Yeah to me that's a people problem, the tool is just the weapon.
1
Yeah, good point buddy.
Perhaps it sounds a bit preachy, but we are witnessing the failure of societal structures once taken for granted (look at Abe's end).
Why shouldn't we expect the same from companies... they're going bust on the stock market.
Also, Trump's tenure showed us how huge companies could be banned in a jiffy.
Why not "hedge" our tech bets with OpenStack/Kubernetes as you said?
1
Yeah, I could have been clearer about that. We can't easily bundle the CDK because our CLI is written in Go and the CDK requires a CLI in TypeScript. Sure, there are ways we could shell out to it, but it's less than ideal for this scenario.
The alternative would be to use the CDK just to generate the templates, but already understanding how CloudFormation works, I found the CDK just added more friction. The documentation felt fragmented and not as complete/mature as the docs for CloudFormation. We could write the templates in Python, but it still requires an additional language stack (TypeScript/Node) to execute. Having a full Python stack was beneficial to the skillsets we have in-house.
1
Yeah, I meant in addition to that. Read the article I linked and you will see the idea
1
Yeah, I saw an application engineer reading it a couple years ago.
Now he's the director of a new department.
Still doesn't know shit and he and his cronies haven't done anything, but damn, can they fucking put together a powerpoint presentation.
1
Yeah, I’d be pushing back… there are several good ways to securely connect a on-prem network with a VPC (in AWS at least)…
1
Yeah, in my experience it's very common for organizations to decide they want to use XYZ and then backfill the justification with possible benefits that they might see rather than taking measure of what they need to do and developing a solution to that. Especially in the tech stack. Lots of "stakeholders" looking for resume padding to say, "I did this" rather than demonstrating any real consideration around practical concerns resulting in things like "we deployed our PHP WordPress site to Kubernetes!" Unfortunately, because of the keyword driven screening process in hiring, this is unlikely to change. And talking companies/stakeholders off that ledge is exceedingly difficult.
1
Yeah, it's always been like this. When my career first started, everybody was asking obscure brain teasers in interviews like "how many bags of cheetos can you fit in a 747" because somebody from Microsoft said they had success with that approach. Then everybody had to use a nosql database. A few years ago everybody had to have microservices.
There's some value in that kind of thinking though. New people are easier to onboard if the team is doing something they're familiar with; even when that something might be overkill for the business's needs.
1
Yeah, self hosted not necessarily a bad thing though depending on your environment. It all depends.
1
Yeah, the ram was ddr3 so not bad.
My worker nodes have 8g ram each and 4 cpus. The manager has, iirc, 4g and 2 cpus.
My priority is stability and reliability, not performance, so as long as disk i/o doesn’t turn the system to a crawl and the page file doesn’t explode, I’m satisfied so far. Nothing I do needs much CPU anyway.
1
Yeah, this has been the gist of most of the responses, lot of encouragement there.
1
Yeah, try setting up 3 environments and create a pipeline to test and deploy stuff, is super fun and also you learn a lot by doing it
1
Yeah, we just found that since it was a null resource, the value of having it in a declarative tool wasn't there. And so the control of a more imperative tool, where you could be intentional about what's happening, led to a lot fewer issues.
We thought that hooking it all up with Terraform would lead to less cognitive load and the ability to pass some values in. But that led to unwanted delete/recreates, which would cause a cascading delete of an ArgoCD app-of-apps for something that wasn't even changing.
The [k8s provider's issues](https://registry.terraform.io/providers/hashicorp/kubernetes/latest/docs#stacking-with-managed-kubernetes-cluster-resources) with stuff like that and needing to split your configs are already obnoxious enough. So if it's already split, we figured we might as well fully split tooling and call it a day: Kubernetes is already a declarative store of state, and we had a method to sync Git with that state, so what were we gaining with tf?
1
Yeah, you would have to (drastically) alter your update strategy, and my i-know-nothing-about-routing brainfart idea sounds like a lot of effort, but to be honest, with the major outages we have seen due to BGP misconfiguration, it is probably worth investing some time into that.
1
Yeah. The reliance on the stock for compensation is what bums me out. It might work out if I stick around for a long time but not doing me any good in the first several years while it vests.
1
Yeah.. a business failing because of business reasons is a hell of a lot more common than one failing over technical reasons.
1
yeah... you would do those steps with code...
1
Years ago, when I started getting into devops, an acquaintance asked me a question that was essentially, “isn’t that just scripting and setting up AWS?”
In reality, while I’ve certainly gotten much better at writing non-trivial shell scripts, I’ve spent more time coding than I did when I was full-stack simply because I spend much less time in meetings and dealing with non-technical stakeholders (not saying this to denigrate the importance of those things, just that I appreciate not really having to deal with that anymore). I also know next to nothing about AWS because the companies I’ve worked for have their own DCs and, in one case, internal cloud services.
1
Yep and funny enough I am someone who makes videos and courses.
I tend to go through courses or books after I've already been using something for a while. My goal is mainly to pick up best practices and look for spots to refactor and improve what I've done.
My thought process there is while you're learning something you have no idea what best practices are because you don't have experience using the thing you're learning. There's no way for you to have stumbled onto or avoided specific patterns based on real experience.
That's why I also try to focus on the "why" in my videos or courses, because to me that's the most important part. The "how" is just syntax. Knowing why something went from A to B to C is really helpful IMO.
1
yep yep yep.
Anytime I need to loop or use an if in Terraform, I wish I was in PowerShell.
1
yep,
```
Run garbage collection

Garbage collection can be run as follows

bin/registry garbage-collect [--dry-run] [--delete-untagged] /path/to/config.yml
```
https://docs.docker.com/registry/garbage-collection/
1
Yep, came here to post exactly this. While I certainly would not mind people shadowing me, I'm pretty sure my employer would be less than ecstatic about that.
1
Yep, I'm using Packer in a pipeline to create some of our AMIs.
I guess I'm more looking for guidance on EC2 lifecycle, and trying to modernize away from using a chef-client model. I don't think that means swapping out for Ansible/Salt/whatever, but leveraging some native AWS tools (SSM, Service Catalog, user-data?) to allow further customization. However, Chef is able to do quite a bit of heavy lifting based on policy groups and other environments, and that logic might be difficult to duplicate.
So I'm really trying to back way up and come at this from a fresh approach.
1
Yep, this is the best way to deploy Redis in AWS: https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/elasticache_cluster
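For what it's worth, a minimal sketch of that resource (identifiers, node type, and parameter group are placeholders; adjust to your engine version):

```hcl
resource "aws_elasticache_cluster" "redis" {
  cluster_id           = "example-redis"
  engine               = "redis"
  node_type            = "cache.t3.micro"
  num_cache_nodes      = 1
  parameter_group_name = "default.redis7"
  port                 = 6379
}
```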
1
Yep, it's all about knowing "what" to google, which translates to asking the right questions.
1
Yep! Automating your existing work is priority 1 where possible simply because that then frees up time you can spend improving other aspects of your environment. It also gives you run books you can use for disaster recovery and other more critical circumstances.
1
yep. best bet is to get a job as a jr. software engineer then transfer, or start in IT then move up
1
Yep. It's the way the pros learn.
1
Yep. My last job was one of those “kitchen sink” type devops/RE jobs, in addition to ops work I also got slapped with cost management of our cloud when they hired me.
This place had NINE worker nodes running in Azure. Wanna guess how many of them had anything deployed to them?
One. And it was a dev app. Its production counterpart was running natively on a managed App Service Plan.
There were unclaimed volume specs, any/any ingress tables, namespaces built with no pods, deployments or services, expired certificates.
I ask my Director how we got to this state. He didn’t know. I ask “ok, well then what are we solving for by adopting kubernetes?” He didn’t know.
No one knew. A former SRE had convinced everyone it’s what they needed, and then bailed halfway through.
I left that company after three months and went somewhere else.
1
Yes
1
Yes & No. Reading The Goal gave me a deeper understanding of the Theory Of Constraints and in turn that allowed me to explain those concepts in simple terms when discussing our devops implementation battleplan.
1
yes and no
1
Yes, by Cloud API I was referring to SSM.
1
Yes, definitely you can say that I am outside of the norms lol! I never managed to adapt (at least not for long before rolling back) to a "normal" lifestyle.
Thank you really for your support and for encouraging me to stand up for myself and shape a life where I feel happy and productive.
1
Yes, it is true that since I don't like dev work at all, it's really questionable whether it's worth pursuing...
It's also a bit funny because the really good developers on our team are so different from me... I know that not all developers are supposed to have the same hobbies or character, but I mean, oh my god. The kids are super nerdy about mathematics, they love to challenge themselves with difficult tasks that require super hard work, they are morning persons, and sometimes I feel they are too, let's say, nerdy and predictable...
On the other hand I am a fanatic night owl, I prefer to play a lot of video games on the huge TV I have at home or go on adventures with girls, and I prefer to keep things simple at work without much analytical challenge LOL.
1
Yes it will work and best part is some companies pay arm and a leg to get such a thing.
1
Yes, it's not perfectly secure, just like keeping your Switch at home is no guarantee it won't be stolen. Someone can break in either way.
A deny-all inbound rule makes that subnet useless then. You've pretty much done the equivalent of pulling the plug at that point.
1
Yes, that sounds smart and a good approach actually! Though my manager and most of the team members already know that I come from a very traditional system admin background.
I told my manager as well a few days back when we had this 1:1 meeting (more like a support meeting, congratulating me for passing the probation period); I told him again that I used to be a system admin and he then suggested that in the next few months we'll look together at what suits me better.
Still I feel uneasy because let's be honest... no matter how supportive a company is, business is business, so if 90% of the work is development I might have an issue in the future...
1
Yes, that's true that it depends on the team and the company I work for... So far the company is extremely chill (more than any company ever) and I want to keep my place here lol.
I don't like writing code in general. I don't enjoy the whole process of writing code.
Some scripting here and there is OK, but no way a proper development cycle.
Even when I was trying to make some personal projects in the past, I was enjoying the HTML/CSS3 concepts more than the actual programming logic when I moved to JavaScript/React. I don't like algorithms and mathematics or the analytical way of thinking of a programmer/developer, so that's why I want to be open about this and go after different tasks/projects/career paths.
1
Yes this is great. I was thinking of tools necessary for making something like this myself.
1
Yes which is why I’m switching back
1
Yes, your solution works, and with Dynamo's scaling it functions really well, so infrastructure-wise you will have a responsive app. To make it a bit more production-like you should start with an API gateway in front of the Lambdas. This can be secured by API keys and even OAuth if you want. WAF is a good shout to have a look at next.
After that you should make use of monitoring in CloudWatch or similar. If you do all of this through CloudFormation and CI/CD you're golden.
1
Yes, that could possibly work... but it really depends on how future updates are deployed, does it not?
1
Yes, about 5 years ago (or so). We had a partnership with a Chinese company who owned and managed the account. (You need this as you can't, or at least couldn't, just sign up for an AWS China account.) We got it working, but there were a million little challenges. We were using EC2 instances and Chef (not Kubernetes) and ran into problems where various software mirrors were blocked (the Chinese DevOps team we were working with were skilled at getting around these with various proxies), and you can't really do cross-account anything with IAM policies because the Chinese regions aren't federated in any way with the AWS commercial regions, and we had to hack a bunch of stuff in Terraform to get it working.
Many of the Terraform hacks I had to do had tickets opened, and I hope after 5 years that most of those changes were merged but shortly after opening that region, we stopped using it, so I have no idea.
1
Yes, for sure, certs aren't a guarantee but more a statement that someone is at least knowledgeable in that specific area. I'm looking at Azure certifications as I used a lot of Azure back when I was a programmer. I think knowing a scripting language might be useful for IaC; I'll probably look at Python as it's really a "hot" thing right now. Thank you so much for your help!
1
Yes, if your interview takes from 2 weeks to 2 months.
Could you provide some examples of questions or activities, and how you test a candidate?
1
Yes, it's a problem. If I learn they're trying to emulate FAANG interviews, I'm not going to waste my time with it and bow out. (High likelihood of rejection along with a lot of effort, no thanks.)
1
Yes, it's called lateral movement.
Strictly speaking though, lateral movements do not depend on public or private networks but on a lot more things.
1
Yes, mainly because our source control (Bitbucket Data Center) is in a private VPC, and we use security group rules to grant access to the various CI/CDs of all of our teams. Because of this we need predictable IP addresses to whitelist. Services like Azure DevOps and Bitbucket Cloud have IP blocks that can change weekly, so we have to host our own agents for tasks that deploy or configure resources in AWS and can't use hosted Azure Pipelines agents. Also, maintaining the security group rule set is challenging. So we tell teams to send the infrastructure team the NAT gateway IP for the master CI/CD instance, and not have every agent poll Bitbucket (one team had each k8s pod polling, using unique public IPs, and wanted to whitelist 50 IP addresses).
* One team has a jenkins instance that polls bitbucket for commits, which triggers jobs to spin up containers running on fargate in the intended AWS region to perform deployments(15-25 minute lag just to get the container up and start working on the jobs, but this scales very well).
* Another team uses bamboo to poll bitbucket for commits, which triggers bamboo agents that are always running(faster execution time, but this team only uses 2 AWS regions and almost never has concurrent jobs running).
* A third team, which I'm working with, uses Azure Devops agents hosted in AWS. We're migrating things from cloudformation to terraform, and only need one agent to do multi-region deployments for our infrastructure. The actual builds for the developers run in a hosted azure devops scale set that scales depending on the number of jobs in the queue. There's a 6-10 minute lag on the scaling and we keep at least two agents running at all times.
The biggest challenge is dealing with AWS regions; we don't have that problem with Azure. As for applying configurations to VMs/containers, we use SSM documents if we really need to, or have a collection of post-deployment scripts in S3, and the VM/container image is set to download and execute them for tasks that aren't baked into our gold image. This usually happens when one team proposes a unique change to the image, but it's rejected as it's not useful for everyone in the org, so they'll bootstrap it from their own S3 buckets. This does require them to properly set up the IAM role to allow access to S3, though.
1
Yes, once you reach that stage, when you are able to manage cattle and no longer pets, you'll be in a good place to move towards containers.
I'm looking forward to reading a post where you update us and tell us how you don't have to take 3AM calls any more.
1
yes, simply yes
1
Yes, that makes a lot of sense. I started off getting AWS Certs, learning Python and Linux. Then I read that you should learn additional services well like Lambda, Cloudwatch, Terraform, Boto3, etc for DevOps.
Then I read you should know CICD and Containers at the DevOps level. This is the point where I realized DevOps may be too high of a starting point.
1
Yes, then it's systemd. Just google how to write a simple systemd service for Ubuntu and you will be set.
You can even look at the docs for how to have systemd restart your script when it detects a file change in it. That way you simply need to upload the new file and let systemd take care of the rest.
It will also make sure it keeps on running and restart it in case of it crashing. (Depending on how you write your systemd service, that is).
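A minimal unit file for this might look like the following sketch (the file name, description, and script path are hypothetical placeholders; adjust them to your script):

```ini
# /etc/systemd/system/myscript.service  (hypothetical name and paths)
[Unit]
Description=My long-running script
After=network.target

[Service]
ExecStart=/usr/bin/python3 /opt/myscript/main.py
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
```

Then `systemctl daemon-reload && systemctl enable --now myscript.service`. The restart-on-file-change behavior mentioned above is typically done with a companion `.path` unit watching the script file.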
1
Yes, they do.
1
Yes, though it depends on the group of people. I've seen a number of folks rush to copy without thinking too deeply about it, or without looking to understand when and why the technology or process is an improvement. In some cases it's just that the person's overwhelmed and feels they have no one to turn to. In other cases it's lack of experience evaluating third party work.
1
Yes, you need to follow the basic installation requirements, or you're not going to have a successful time with it.
https://rancher.com/docs/rancher/v2.6/en/installation/requirements/
You need to find a way for your company to pay for VMs or give you hardware to support your efforts. You shouldn't be doing this out of your own pocket.
1
Yes, you're missing confidence.
(I'm not kidding. Just start applying, failing, learning, then do it again, but this time succeed.)
1
Yes, that is a good approach. Also, anything that you bring to the table needs to have a purpose. I want to do this Prometheus service because it... idk, will help to debug X project, helps this dev team with this task, and also future projects and centralization.
I think your personal growth is important and it's not considered bad that you work on that, but managers still always want good or useful results as the output of your tasks.
1
Yes!
1
Yes! I can't agree more with you! You are so right. It's like I heard my inner voice speaking LOL.
Especially when you said that some people don't have the capacity for that kind of abstract reasoning.
It's one of the reasons I always hated mathematics in school, although in other subjects I could consider myself a very good student. One of the reasons also that the only thing I enjoyed during my bachelor's in IT was Linux and maybe a bit of IT security (at least the TCP/IP part).
Yes, I hope I can find work suited to this version of "DevOps" that you described. Not writing any application code. The question is still how to get some training in my current company on such tasks; being on a project that is mostly development isn't the ideal way to grow...
Though I believe even my manager has realized that about me, and maybe he can assist me in getting myself onto such tasks or different projects in this company.
1
Yes!!
1
Yes.
1
Yes. Also you want to be able to run your build scripts *outside* of your CI system, so you don’t have to run a pipeline to test changes, etc. Each job script in CI should simply call another script that comes from source control.
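A tiny sketch of what that separation can look like, with hypothetical names; the point is that the CI YAML contains no build logic of its own:

```shell
# ci/build.sh (hypothetical): all build logic lives here, in source
# control, so it runs identically in CI and on a laptop.
run_build() {
    echo "linting..."
    # real compile/test commands would go here
    echo "build ok"
}

run_build
```

The CI job definition then reduces to a single line such as `script: ./ci/build.sh`, and testing a change to the build no longer requires running a pipeline.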
1
yes. My flow is CloudWatch/Kinesis/Datadog agents -> Datadog -> PagerDuty -> statuspage.io. I take care of a lot of AWS and Azure accounts, so I use 3rd party tools to centralize things (I use resource tags to be able to filter).
But also think about the title SRE (site reliability engineer). Look at potential causes that would limit the reliability of your product, and use automation and tools to fix them. Or go to the developers and request changes to improve the reliability. For compute I look at the triggered alarms, like high CPU, RAM, IOPS, and try to identify the cause or detect a pattern. If I see that I'm only getting these alarms between 6-9pm M-F, that tells me I should adjust the infrastructure to scale out/up just before this time, and scale in/down afterwards. If I get alerts from long-running queries in the DBs, I try to understand if this is normal usage, a scheduled task, or something else. Depending on what I find, I may create a bug and assign it to the DBAs or the developers. Microservice-wise it's typically alerts for messages stuck in a queue, or in a DLQ, and failing Lambdas. Sometimes it's an easy fix, where there was a problem upstream; other times it can be an app update that broke something. I also look at the tickets that are created for the app support team, and try to see if there's a pattern that can be resolved with automation or if we need to get the developers involved to patch.
1
Yes. And The Goal also
1
Yes. Definitely.
I managed to stop it though in one company, I was brought in as a potential CTO candidate; team was mostly made up of contractors- really smart folks actually.
They told the CEO (with me present) on how to hire engineers, started planning take home tests and leetcode-style testing.
I said basically: “google operates nearby, if someone’s going to go through all that; they’ll go to google, and we can’t pay like google can. We have an opportunity to do things at a human scale. We might not get the best candidates on paper, we might waste a chunk of time, but there’s little better than being a human face- which is what we should be if we are small.”
1
Yes. I have also cert-manager to manage the wildcard.
1
Yet Another Makeup Language
1
You accidentally add a DENY statement that applies more widely than you intended. Only way to fix it is to log in as the root user (instead of an IAM user/role) and update the policy.
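As a hypothetical illustration of how this happens: a Deny meant to scope down one bucket, but written with wildcards, blocks every action, including the `iam:*` calls needed to remove it, which is why only the root user can recover.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "MeantForOneBucketButAppliesEverywhere",
      "Effect": "Deny",
      "Action": "*",
      "Resource": "*"
    }
  ]
}
```

An explicit Deny overrides any Allow in IAM evaluation, so once a policy like this is attached broadly, no IAM principal can edit it away.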
1
You act like gatekeeping isn't basically a zero-trust policy; why is that even a bad thing with this audience?
You leave your firewalls open and say "come on in, I won't judge your packets!" I guess.
OP doesn't code or script. Junior, maybe. DevOps? In fact, what are we talking about him doing here, is he a PM? He just likes the idea of virtualization?
What does OP even do?
1
You are doing a great job just for the fact that you want to continue building your understanding and growing your technical skills. After you feel comfortable in your environment, try to bring new ideas that can help in your daily work (or your coworkers'), for example managing metrics, logs, reports, alarms, anything that you or some other coworker can find useful.
1
You are mentioning lots of good stuff, configuration management will help a lot. Simpler things like monitoring should go first, zabbix is pretty easy to stand up and deploy.
1
you are on the right track. the git push collision is often overlooked or put into a retry loop, which is worse.
look into splitting your code branch from your deployment branch. ex: have devs merging to master and your automation merging master onto a “production” branch.
github actions with concurrency=1 is a good choice, or https://docs.github.com/en/repositories/configuring-branches-and-merges-in-your-repository/configuring-pull-request-merges/managing-a-merge-queue
but essentially you make it so automation is the only one pushing to that branch.
there are some downsides to this approach, e.g. your automation should be limited to files not touched by other branches, like a release version file, otherwise you risk a merge conflict.
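A runnable sketch of the master/production split in a throwaway repo (branch names, file names, and the version bump are just examples):

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "ci@example.com"
git config user.name "ci"
git checkout -qb master

# devs merge to master as usual
echo "v1" > app.txt
git add app.txt
git commit -qm "dev change lands on master"
git branch production

# --- automation only, e.g. a CI job running with concurrency=1 ---
git checkout -q production
git merge -q --no-edit master
echo "1.0.0" > VERSION              # release metadata only automation touches
git add VERSION
git commit -qm "release 1.0.0"
git checkout -q master
echo "production has $(git show production:app.txt)"
```

Because only automation pushes to `production`, and the files it writes (like `VERSION`) never change on `master`, the merge can never conflict or collide with a dev push.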
1
You are solely responsible for the chip shortage!
1
You are wrong; Flyway was specifically designed to run migrations that way. Anything else is error-prone and semi-automated at best.
You can safely run migrations from multiple nodes at the same time; Flyway uses locks for that.
1
You aren't thinking this through correctly. At the very least you'll have to retag the container image to push it into any other container registry, including GitLab's.
1
you bring up an excellent point, I really don't want to have to write the same test twice. I appreciate the Serverspec call-out, I'll check that out. You've saved me a lot of headache, I appreciate it.
1
You build it. You own it.
How you want to execute on that philosophy is up to the stakeholders to decide what’s best for the business whether it be a separate DevOps team or whatever.
1
you can add a space or comma
Even more interesting, add to your name something like Use\_Bob\_if\_not\_a\_robot Smith
You will be surprised how few people actually read your profile
1
You can also try Rego with Open Policy Agent.
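For a taste of what that looks like, here is a minimal hypothetical Rego policy (the `input` shape is made up for the example):

```rego
package authz

default allow = false

# permit read-only requests from a known service account
allow {
    input.method == "GET"
    input.user == "ci"
}
```

OPA evaluates `data.authz.allow` against each incoming `input` document, so the same engine can gate API requests, Kubernetes admission, or Terraform plans.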
1
You can run GitLab for free on an instance. They even have Docker containers for that and the runners.
1
You can do that pretty easily with the wrong bucket policy.
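For example, a policy like this hypothetical one makes every object in the bucket world-readable, which is usually what a "wrong" bucket policy ends up meaning:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::example-bucket/*"
    }
  ]
}
```

The `"Principal": "*"` is the dangerous part: it grants the permission to anonymous requests, not just identities in your account.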
1
You can have more than 5 Elastic IPs; you just need to make a support request.
Unless that's different for GovCloud?
1
You can mount the wsl volume in linux.
1
You can put your app behind cloudflare to get ddos protection and rate limiting as well as CDN caching and a bunch of other stuff.
1
You can run your own nat if you don’t like the cost of aws turnkey. I did this when I wanted to log all outbound connections. I think I just did it with iptables and masquerade on a t2 micro.
Note, the NAT has to live in the public subnet and you have to do the VPC routing stuff yourself. I don't recall it being hard.
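The core of such a DIY NAT box is only a couple of commands. This is a sketch assuming `eth0` is the public interface and `10.0.0.0/16` is your VPC CIDR (run as root on the NAT instance, and remember to disable source/dest checking on it):

```
# enable packet forwarding on the NAT instance
sysctl -w net.ipv4.ip_forward=1

# masquerade traffic from the private subnet out of the public interface
iptables -t nat -A POSTROUTING -o eth0 -s 10.0.0.0/16 -j MASQUERADE

# optional: log each new forwarded outbound connection
iptables -A FORWARD -s 10.0.0.0/16 -m state --state NEW -j LOG --log-prefix "nat-out: "
```

The private subnet's route table then points `0.0.0.0/0` at this instance instead of an AWS NAT gateway.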
1
You can see the index.html as a deployment artifact. How I would do it: after it's generated, publish it to some artifact store like Artifactory. If you want, you can temporarily rename it to your commit ID before deployment; then you `wget` the file and rename it back to index.html. The benefit of this approach is your file stays the same from non-prod to prod. Whereas rerunning the same script may, on rare occasions, generate a different index.html.
1
You can store your CI config in a different repo. Gitlab has an option to point to an external CI script (from a different repo). That would ensure all branches are up-to-date immediately, when done correctly.
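In GitLab that's the `include` keyword; a hypothetical `.gitlab-ci.yml` in each repo can be reduced to a pointer at the shared repo:

```yaml
include:
  - project: 'my-group/ci-templates'   # hypothetical central repo
    ref: main
    file: '/pipelines/default.yml'
```

Pinning `ref` to a branch means every project picks up template changes immediately; pinning to a tag gives you a controlled rollout instead.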
1
You can understand architecture, and by extension strategy, without needing to remember the syntax of `kubectl rollout`
1
You can use GitLab as well. But as someone who has used both, you may find GitHub Actions slightly more intuitive.
They are similar in the fact that they help bridge the code with the automation, but how they go about it is a little different.
If you're more familiar with GitLab, that's definitely an option, but if I were to choose, I'd say GitHub Actions is slightly more beginner-friendly. There also seem to be more examples.
1
You could argue the fix should target the issue generically rather than a specific row, i.e. if a varchar is too long then the pseudo code would be
`UPDATE table SET column = LEFT(column, lengthLimit) WHERE LEN(column) > lengthLimit`
This way it would fix all cases regardless of where they occur, however I realise that would be difficult in some cases
1
You disagree that Terraform creates a barrier to how expressive you want to be?
1
You do have different ways. But this is not one of them.
The only place an image like this could be justified is in a CI/CD build system... For anything else, it's bloated and adds extra complexity.
1
You don't suddenly forget how to program when you change role. Even if you don't code for a year it won't take you long to get back into it again; it is like riding a bike.
You will get less and different experience with programming in a DevOps role than in a pure software eng role. And it depends a lot on which company you work for _some devops roles are just glorified sysadmins that don't do any programming, others spend a lot of their time building tools for other teams_.
1
You forgot the “bullshit” in front of “security”
1
You get through doors easier. I wouldn’t say I got better offers because of it. This goes for most certs.
1
You got it man! Imposter syndrome is an SOB
Keep grinding on the k8s, ask for help, and start documenting as you learn so the next guy avoids this.
We all been there!
1
You have better ways to control versions in a shared environment... For example, in Terraform you have a version constraint you stick in your configuration to force a particular version, which is required anyway when you work with TF older than version 1.
Would you like to work in an environment that forces you to use Notepad++ as a text editor because that's their standardized tooling?
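The constraint in question is Terraform's `required_version` setting (the versions here are only examples):

```hcl
terraform {
  required_version = ">= 1.3.0"   # refuse to run under older CLIs

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 4.0"          # pin the provider too
    }
  }
}
```

With this in the configuration, anyone running a mismatched CLI gets an immediate error instead of silently writing state in an incompatible format.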
1
You have my upvote!
1
You hit it exactly. I could not communicate this successfully to people at a previous job: yes, when you were at Google they did things this way, but that worked because they had thousands of engineers building the infrastructure for it over the last fifteen years, whereas here it's just me over the last six months...
I have many dear friends from Google who I respect quite a bit. But I've noticed there's a certain category of people, not all of them, who fall into bad habits there. Anything you want to do, there's literally a world expert on the subject who has already decided on the path, so they get used to not thinking at all. It's easy to just go with whatever's been decided and not think about it any more, but then they get used to being intellectually lazy and have a hard time getting out of it when all of a sudden _they_ are the ones responsible for defining strategy and making decisions.
1
You just know someone is going to put theirs on vinyl because it sounds better to them
1
You know how the Let's Encrypt Oak CT log had an outage back in December 2021? That sucked and caused a follow up event in March 2022.
https://groups.google.com/a/chromium.org/g/ct-policy/c/9fTOoC4UzmE
https://groups.google.com/a/chromium.org/g/ct-policy/c/sdPvvZSp7Rw
1
You live in Colorado, they are legally required to provide you a salary range. I'm not a lawyer and haven't dealt with it directly yet, so I have no idea what to *do* about it, but you may be able to lean on that fact during negotiation.
1
You lost this game before it even started.
"Pushing to prod branches" is a no-go.
Sharing secrets is an even bigger no-go.
Proper CI/CD solves this, along with IAM roles and service principals.
P.S.
The code, once downloaded to a dev workstation, remains there and they can do whatever they want with it.
1
You mean AfyA
1
You mean GitLab self-managed? This was also in the discussion, but my employer and I both had bad experiences with hosting our own GitLab instance.
1
You mean in the UI?
1
You mean the ALB? It never broke. An AWS ALB is not a physical device; it's some software-defined network magic done by AWS. It just works. Of course it could have a bug/outage, but that can happen with any AWS service...
Setting up your own Nginx/HAProxy is 10000% more error-prone.
1
You might be doing this every day but the kubernetes provider documentation literally tells you not to
https://registry.terraform.io/providers/hashicorp/kubernetes/latest/docs#stacking-with-managed-kubernetes-cluster-resources
> When using interpolation to pass credentials to the Kubernetes provider from other resources, these resources SHOULD NOT be created in the same Terraform module where Kubernetes provider resources are also used. This will lead to intermittent and unpredictable errors which are hard to debug and diagnose. The root issue lies with the order in which Terraform itself evaluates the provider blocks vs. actual resources. Please refer to this section of Terraform docs for further explanation.
1
You might lose a bit of an edge, especially when it comes to OOP, algorithms, and data structures.
1
you need a license for every Artifactory instance...
1
You need to add and modify your HPA to include some autoscaling parameters. It's hard to know exactly what those settings will be, based on your actual resource utilization. But, you'll add a scaleUp and scaleDown behavior.
As the HPA monitors a resource metric (CPU) it will check that metric holds true for a period of time -- say 30 seconds. Once it does, the HPA triggers a scaling event. This can be based on a policy of adding single pods, or a percentage of pods. When testing, you need to find out how long it takes a new pod to
1) be scheduled and enter 'running' state
2) readiness and liveness checks to both pass
3) pod to be entered into load balancer
4) load balancer to pass traffic to pod, after the service's health check passes
5) traffic increase on pod to cause scaling metric trigger
Your HPA is measuring against the 'request' value of your CPU, so it will want to trigger and scale at 125m, a fairly small amount of CPU. This amount of CPU probably can't support 100 concurrent user requests. There could be application issues, depending on the complexity. But if you just want to try and model pod scaling behavior, you probably want to drastically reduce the scaleUp stabilizationWindowSeconds and periodSeconds to something lower than your liveness timeouts. One wouldn't do this in a production environment, as it is going to cause rapid pod fluctuation too, but for playing around it's fine.
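Putting the pieces above together, a sketch of such an HPA might look like this (all names and numbers are illustrative, tuned for experimentation rather than production):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 15   # short on purpose, to watch scaling happen
      policies:
        - type: Pods
          value: 2
          periodSeconds: 15
    scaleDown:
      stabilizationWindowSeconds: 120
      policies:
        - type: Percent
          value: 50
          periodSeconds: 60
```

The `behavior` block is where the scaleUp/scaleDown policies described above live; the utilization target is a percentage of the pod's CPU `request`.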
1
You need to have a good understanding of the topics I listed below. Those are fundamental tools that you need to be able to work with.
- Linux
- virtualization and hypervisors
- Git
- containers
- generic OS knowledge
- Networking
- Scripting
- Orchestration
- Configuration (infra, application, OS)
From there on you can work [Roadmap](https://roadmap.sh/devops) to pursue whatever discipline you need.
1
You need to really nail the CI/CD part.
If you don't live and breathe the cultural aspect of DevOps, spend more than one day on that.
1
You only create packages for distribution. Distribution could be allowing others to use it as a package in another project, or allowing yourself/others to deploy it.
The packaging approach in these two scenarios will differ due to different needs.
If it's just a project you'll run locally and never deploy to other hardware, you won't need to bother packaging it in any way.
1
You probably need to do some kind of threat analysis to figure out what threats there are, what is worth solving, and how to solve them. Come up with a list of possible threats to the business, then come up with multiple possible solutions to them, and then try and whittle them down into some simple and coherent actions and processes. For example:
Threats:
* Loss of code / IP due to GitHub data loss.
* Customer data breach due to unauthorised access of AWS infrastructure.
* Staff data breach due to unauthorised access of payroll system.
It can be anything like this really, but the main thing is to keep it focused on the areas owned by the team you're doing the exercise with. On the above for example, the developers probably have no responsibility for the payroll system so that won't be relevant. Then you can move onto figuring out whether they have any impact and are worth addressing, and the ones which are worth addressing you can do so. That may be some kind of automation, changing a setting somewhere, or a change to your business processes.
Any team can do this, you don't need to tackle everything at once either. I'd suggest doing a session listing some threats and then assigning each a business impact. After that go away, take out anything that's not worth thinking about and return with fresh eyes to try and understand a way to protect against those threats.
Everything you've listed so far seems sensible, but it'll be worth doing some analysis before jumping into solutions so you don't add unnecessary burdens onto the team. It doesn't need to be expensive or long winded and you can break it out over the course of a few weeks and even do it async if you want.
1
You rely on input from technical staff. You ask your staff for a decision slide deck with pros and cons, and you consult technical staff alongside purchasing and other domains.
It's not just a technical decision.
1
You say you had an unforeseen personal issue, you apologize for the lack of notice, and you make it up the rest of the week by staying a bit later every day.
If this doesn't become a habit, any manager worth their soft skills salt will give you a one-time pass.
1
You should be building a new golden VM image, and then rolling your Scale Set over to that new image. Scale Sets are not built to deploy new code into them at runtime. They are meant to build an image and scale that image up and down, then deploy a new image with changes, and deploy that image up and down. Each image should be idempotent.
If you are insistent on doing things the way you are talking about, you should use a pull model where the machine boots and then pulls down new code with a cron (or schedule task) which pulls code and bounces proper services. This is for sure a scale set anti pattern and should be used sparingly and only when absolutely necessary.
1
You should be terrified that some of those ppl work for Fortune 100 companies and there are no better ppl.
This is the best we can work with.
1
You should definitely check out strongDM for the security aspects you talk about.
1
You should do a Udemy course on Docker and Kubernetes
1
You should not be using Vagrant, but Packer, for this purpose. Set up autoscaling runners on a cloud provider of your choice (e.g. AWS, GCP, DO). Create an AMI/custom image and use it as the source image for your runner. Your VM/EC2 instance will then boot up into a VM where all dependencies are already installed. You can now download your code and run the tests. Pro tip: you can also build the Packer image through CI/CD, which helps maintainability.
An alternative approach would be to set up static EC2 instances, where the deps are already installed, and then use the shell executor.
That being said, why aren't you just using Docker? Wouldn't it make your life much easier?
1
You should not have either if you care about security.
1
You should really think about what DevOps means to you. Is it automation/CI/CD? Is it offloading the pipelines from developers?
You'll get questions in the interview that make sure you're on the same page as the company. Have good answers there.
If you want, you can answer me here and I’ll give some constructive feedback. I’m hoping to transition to a more devops focused role soon but I’ve taken to filling that gap on my teams already and love it.
1
You sound like me literally at the end of 2021. I had a lot of this role dropped in my lap out of need, and didn't have any sysadmin/help desk/IT experience before this; I just used Python to help in my previous unrelated supervisor role.
Stuff was manual for a while, then when I couldn't handle it, Ansible helped greatly. Then as I kept wanting to implement automation, the automation was OK, but the underlying approach was off since I didn't have a good working history to draw from.
Once we were on our feet as a company, I more or less said "you're going to get junior reliability out of this (me) unless we have more of a brain pool here". We've hired a part-time consultant who is reinforcing some practices he says are good and should stay, and he's helped us move away from behaviors that cause tech debt.
Why I drew the line: I can get certs all day, but the company is limited to my experience, and that is limited to my time in this role.
Benefits to date: now we have senior guidance on what we're doing, while I'm still able to drive a lot of the ship. Instead of having anxiety that a solution "could work" as I push into production with little experience, I simply confirm practices with him regularly and allow him to shape aspects of how we work iteratively.
This way you don't absolve yourself and write "junior" on your forehead, while retaining the ability to learn a lot quickly.
If your company doesn’t want something like this, then they’ll have to hire a senior anyway at a certain point if your feelings about this remain the same.
1
You still do all of that
1
You want to learn: Docker, Terraform, k8s.
Learn Docker by reading some existing Dockerfiles for open source projects, then writing your own ones. Google it and search on YT - plenty of materials for it.
Then set up a tiny self-hosted k8s cluster in AWS and play with it.
Here are some links I have readily at hand:
```
# https://kops.sigs.k8s.io/getting_started/aws/
# https://www.technicallywizardry.com/kubernetes-aws-kops-less-than-one-dollar-per-day/
# https://mattjmcnaughton.com/post/reducing-the-cost-of-running-a-personal-k8s-cluster-part-1/
# https://mattjmcnaughton.com/post/reducing-the-cost-of-running-a-personal-k8s-cluster-part-2/
# https://mattjmcnaughton.com/post/reducing-the-cost-of-running-a-personal-k8s-cluster-part-3/
# https://stackoverflow.com/a/66346796/134409
# https://github.com/kubernetes/kops/blob/master/docs/security.md
# TODO: https://github.com/kubernetes-sigs/external-dns
```
and here is [the project in this area that I'm working on](https://github.com/rustshop/rustshop)
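Once a toy cluster is up, a minimal first Deployment to experiment with might look something like this (the name and image are arbitrary placeholders, not anything from the links above):

```
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello
spec:
  replicas: 2
  selector:
    matchLabels:
      app: hello
  template:
    metadata:
      labels:
        app: hello
    spec:
      containers:
        - name: hello
          image: nginx:1.25   # any small public image works for a first test
          ports:
            - containerPort: 80
```

Apply it with `kubectl apply -f hello.yaml`, then poke at it with `kubectl get pods` and `kubectl describe deployment hello` to see what the cluster did with it.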
1
You would think that we would invest money in speeding up boot time.
Have you heard of Neon? It's a serverless Postgres database technology. I think it scales to zero.
1
You'd be surprised how much easier cdks make things. And I genuinely mean that :)
1
You're entirely correct. As it stands, the pipeline will fail (and do nothing other than log itself) and a new one will run, so the file is created anyway. So releases aren't being impeded right now, I'm just dealing with a really pointless looking build error.
1
You're expecting way too much of yourself; juniors aren't really meant to be self-sufficient or "productive" relative to the time invested in them.
This is the time in your career to acquire as many general skills as you can, and to get good at anything you're not good at. Make the most of it and start doing as many dev tasks as you can to improve at programming.
Programming is just a skill, like any skill it takes time to get good at and if you don't do it for a while you get a bit rusty. IMO most people aren't "talented" at something they just enjoy it and so they do it more, eventually they get good as a result of repeated practice. Even if you don't find as much enjoyment in it as some other people you can still practice just the same and reach the same level as most other people.
If you don't enjoy programming and just want to do sysadmin-type things then that's cool, but don't throw away the chance to at least reach proficiency! Even if you don't want to program much in your next job, it'll be a really useful skill and will make life much easier in any technical job in the future. After that you can start looking for sysadmin jobs, though for those you'll likely need to get better with Linux. If you really don't want to be programming, you're going to push yourself out of "devops", which for better or worse always involves some level of programming because:
1) you're often developing tools for others
2) DevOps = Developer & Operations, not just operations
Don't put so much pressure on yourself, you're a junior engineer so don't compare yourself to a senior who has 10 years of experience. Most importantly, when you've been in the industry for 10 years make sure you give juniors the same kind of advice and lots of freedom to try things out and practice all of their skills.
1
You're hitting your server at https://localhost, but localhost isn't a valid name in the SSL cert. If HAProxy also listens on HTTP, change the scheme to that and it will work. If it only listens on HTTPS, you'll need to tell Prometheus to skip certificate verification by adding:
```
tls_config:
  insecure_skip_verify: true
```
to your scrape config.
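For context, the full scrape job would end up looking roughly like this (the job name, port, and metrics path here are placeholders - adjust them to your HAProxy setup):

```
scrape_configs:
  - job_name: haproxy            # placeholder job name
    scheme: https
    metrics_path: /metrics       # placeholder; use whatever path HAProxy exposes
    tls_config:
      insecure_skip_verify: true # skip CA and hostname verification
    static_configs:
      - targets: ["localhost:443"]  # placeholder port
```

Note that `insecure_skip_verify` disables all certificate checks for that job, so it's best kept to scraping local or trusted endpoints.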
1
You're missing the point, I'm on your side.
1
You're not wrong. For an actual functional application, yes - all the features need to be in place. But yeah, this actually is supposed to just be one of those "hey, I know some webdev and backend" projects.
1
You're only as secure as your least secure component and there are continual scanners hitting every AWS related IP looking for open ports. So, yes, your 80/443 boxes might only allow network traffic, but they are on the same network that might allow ssh, sftp, or any of several dozen different services, any of which could have a security hole allowing access to your network.
If the public can reach it and interact with it, it needs to be segregated from hitting anything critical.
1
You're right that this is a concern. I designed the challenge to use usage-priced services in the free tier wherever possible. While that doesn't guarantee anything, we're 27 months into this and I've yet to hear of someone suffering a significant billing surprise during the challenge.
1
You're thinking about naming conventions in far too much detail (imo). Security best practice is that you don't make labels too descriptive, i.e. don't make it easy for bad actors to find systems to attack.
How about just entirely random names which tie back to a central index somewhere, e.g. dhebdid is server001 in Sweden.
Or swe001, swe002 etc., where for more granularity you refer to the configuration management DB or other documentation.
Simple and practical is always best :-)
1
You're welcome ;-)
1
You’d have a really bad day when there are x+1 days between deploys.
1
You’ll probably want a readiness probe too, in addition to what others have suggested, so you’re not adding unready containers to the mix. Your liveness probe is getting shot when your application can’t respond because it’s too busy servicing user requests.
I would suggest using async-capable frameworks like Django Ninja, or newer versions of DRF that support async if they’re out - that will help a lot, but at the end of the day it won’t solve the problem, it’ll just hide it by being more efficient. Your liveness probe issue can still be expected during scale-up if your workload is incredibly bursty, i.e. going from 0 to 100 instantly, but the suggestions from others help reduce flapping where unavailability is caused by scale-up or scale-down activity.
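As a sketch, the two probes on a pod spec might look like this (the paths, port, and timings are placeholder assumptions to tune for your app, not values from your setup):

```
containers:
  - name: web                  # hypothetical container name
    ports:
      - containerPort: 8000
    readinessProbe:            # gates traffic until the app can actually serve
      httpGet:
        path: /healthz         # placeholder health endpoint
        port: 8000
      initialDelaySeconds: 5
      periodSeconds: 10
    livenessProbe:             # restarts the container only if it's truly stuck
      httpGet:
        path: /healthz
        port: 8000
      periodSeconds: 20
      timeoutSeconds: 5        # a generous timeout helps under bursty load
      failureThreshold: 6      # tolerate several slow responses before restarting
```

The key idea is that the liveness probe should be much more forgiving than the readiness probe: failing readiness just pulls the pod out of the load balancer, while failing liveness kills it.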
1
You’re coming at this from the wrong angle. You want the bare minimum of tools to get the job done reliably and efficiently, not a Tower of Babel built from buzzwords. Forget Kubernetes, 90% of the time I see people use it it’s overcomplicated overkill. You’re clearly a very long way from needing it, if everything has been alright so far on bare VMs. Forget all the shite people pile on top of it too.
I’d start by getting some monitoring in place so you know what’s down before your users do. I like Prometheus/Grafana, but you have options. After that, get to the point where you can consistently build your VMs with Ansible. For services you’re deploying, it’s worth looking at Packer, which is a wrapper around your hypervisor and provisioning tool (Ansible) that makes building VM images easier. That gives you a way to quickly reconstitute your infra; then you need a way to do backups.
Eventually you could dockerise your apps, and use ansible to deploy the images and start the containers, but that’s not a small step to take. That’s going to take a while. By this point, you’ll probably have a better idea of what’s needed going forward.
1
You’re obviously the person behind this tool, which is why you’re being labeled as spam. If you wanna self promote, be honest about it, and participate in other discussions as well.
1
Your tl;dr is probably still too vague…
A container image is defined by its Dockerfile (text), and the Dockerfile lives in source control (git).
Typically a new Docker image is built and pushed to an image repository (Docker Hub, ECR, etc.) automatically when changes are detected in git by a CI/CD environment. Your developers can also build and test this image locally…
The container is then defined in code (e.g. Terraform), referencing the image (defined above), its persistent volumes, etc. You then have Terraform deploy to your hosting environment…
1
Your answer is pretty vague, mostly just reciting concepts from the book. I would really appreciate it if you could give real examples of how you define work, what you do with that in practice, and how it has actually impacted your processes and productivity.
1
Your comment ignores gitops tools like ArgoCD where git is the source of truth for deploying changes.
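For anyone unfamiliar, an ArgoCD Application roughly looks like this (the repo URL, paths, and names are placeholders); ArgoCD then continuously syncs the cluster to whatever is in git:

```
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: my-app                 # placeholder name
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example/deploy-repo.git  # placeholder repo
    targetRevision: main
    path: k8s/overlays/prod    # placeholder path to the manifests
  destination:
    server: https://kubernetes.default.svc
    namespace: my-app
  syncPolicy:
    automated:                 # auto-sync: git is the source of truth
      prune: true              # delete resources removed from git
      selfHeal: true           # revert manual drift back to git state
```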
1
Your definition of "bespoke" software is interesting to say the least and you are definitely parsing words when you shouldn't be. (can use / use(s))
Suffice it to say I've done the things you listed in some form or fashion, but I would never, ever have called that "software".
1
Your DevOps team will be able to sort this out. I'd hazard a guess they go with TimescaleDB on Postgres. But again, let the DevOps team sort it out doing DevOps things.
1
Your point is well taken. LOL
1
Your setup follows a common pattern, so I don't think there exists a pattern that is better for all cases. What problems are you running into? What do you feel could improve?
1
Your usual DSL is no more complicated than an API written for a general-purpose programming language. DSLs are usually actually simpler to remember, since they're made for a single purpose and don't have to follow some underlying language's constraints.
I'm happy I don't have to put `public static void` before each of my resource blocks in TF.
1
Your VCS (Github, GitLab, other...) should provide a release artifact storage system. Use `git tag` to push version tags to your git repository and trigger your CI on tags to create your release artifacts, then upload them to your VCS release artifacts system.
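As one concrete sketch (GitHub Actions flavored - GitLab CI has an equivalent `rules` mechanism), a workflow that fires on version tags and uploads release artifacts could look like this; the build command, artifact path, and the third-party release action are assumptions to swap for your own:

```
name: release
on:
  push:
    tags:
      - "v*"                   # fires on tags like v1.2.3
jobs:
  release:
    runs-on: ubuntu-latest
    permissions:
      contents: write          # needed to create the GitHub release
    steps:
      - uses: actions/checkout@v4
      - run: make build        # placeholder build step
      - uses: softprops/action-gh-release@v2  # third-party release action
        with:
          files: dist/*        # placeholder artifact path
```

Then `git tag v1.2.3 && git push origin v1.2.3` is all it takes to cut a release.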
1
Your workflow and efficiency will probably decrease when moving to GitHub/Bitbucket.
You obviously have over 5 engineers using the product. Let's say they become 3% less effective; that will cost the company around 15-18k in billable hours.
Don't take the 2 weeks to migrate; pay the license and use the time to build something that generates revenue. You're wasting time.